Preface


This Guide introduces the many worlds of philosophical logic. Or perhaps I should
say, many of the worlds of philosophical logic, for it cannot pretend completeness.
That would be impossible. Nevertheless, these 20 chapters present a central core of
what constitutes philosophical logic today, and they provide a solid foundation for
further study. I will say more in the Introduction about what philosophical logic is,
and about the selection and arrangement of the chapters.

Each of these chapters is newly written for this volume by a distinguished scholar
in its subject area. Their purpose is to provide the reader with basic knowledge of
the current state of that aspect of philosophical logic, including its concepts,
motivations, methods, major results, and even applications. Each chapter is independent
of the others, so they can be read in any order or selected to suit different interests.
I have, however, included cross-references among the chapters since their subjects
often overlap. (Each chapter was also written independently of the others, and so
their authors might have different views about common subjects.)

This volume should be accessible and useful to anyone interested in philosophical
logic, expert and non-expert alike. It could form the basis for a general course on
philosophical logic, or it could serve as a supplementary resource and reference work
for the study of its specialized topics. Experienced logicians will discover sufficient
substance here to occupy their attention, while the general reader who merely wants
to know what a subject is about will find a definitive introduction to that field.
Philosophical logic is recommended not only for philosophers and logicians. These
days, it is also of great importance for research in computer science, cognitive
science, artificial intelligence (AI), and theoretical linguistics; and, like all logic, it
belongs hand in hand with mathematics.

Logic is a technical discipline, with its specialized language, notation, and methods.
As a result, a reader would benefit from having had a first course in formal logic, or
from having studied any of countless elementary texts, and from having some familiarity
with the logician’s language and techniques. Even so, the chapters here presume
little and explain much, so that even the uninitiated reader should profit from them.

A logician’s language uses a lot of special symbols. But different logicians often
use different symbols for the same purposes, and sometimes different logicians use
the same symbols for different purposes. I have not tried to impose a uniform notation
or style of expressing formal concepts on these chapters, for I believe it is better
for students of logic to become familiar with a variety of styles since that is what they
will meet in the literature beyond this book. The authors of these chapters do,
however, explain their symbols and notations as they introduce them, and readers
should be able to understand new patterns without difficulty and learn to move
among them effortlessly.


I have profited much from working with the authors of these chapters; I learned
a lot from every one of them (as, I’m sure, readers will too), and our collaboration
was always a pleasure for me. My heartfelt gratitude goes to them all. This is their
volume more than it is mine. Throughout the development and completion of this
project, I have also been very gratified by how encouraging, generous and helpful
everyone was with whom I discussed it. In addition to the authors of the chapters,
I would especially like to thank Nuel Belnap, Johan van Benthem, Mark Brown,
Brian Chellas, Mike Dunn, John Etchemendy, Dov Gabbay, Ernie Lepore, Penelope
Maddy, Don Nute, Alasdair Urquhart, Bas van Fraassen, and many others. Thanks
too to Steve Smith and Beth Remmes of Blackwell Publishers, to Steve for launching
the project, to Beth for piloting it into port, with wise advice and wonderful
patience. Thanks also to Jenny Lawson and her first-class crew for manning the
rigging.

Perhaps my greatest debt in this area, though, is to Alan Ross Anderson and Nuel
D. Belnap, Jr. for introducing the many worlds of philosophical logic to me a long
time ago.

What is philosophical logic? Philosophical logic is philosophy that is logic, and logic
that is philosophy. It is where philosophy and logic come together and become one.
Philosophical logic is not a special kind of logic, some species distinct from
mathematical logic, symbolic logic, formal logic, informal logic, modern logic, ancient
logic, or logic with any other familiar modifier. There is only logic. Logic is the
theory of consequence relations, of valid inferences. As such, it can be investigated
and presented in many ways, although the mathematical methods of modern formal
[...]
the sorts of logic that hold greatest interest for philosophers.

[...] develops formal systems and structures to be applied to the analysis of concepts and
arguments that are central to philosophical inquiry. So, for example, such traditional
philosophical concepts as necessity, knowledge, obligation, time and existence, not
[...] are [...] including
the formal languages of logic itself, and this resounds throughout philosophy. By
the same token, many of the developments within philosophical logic have been
motivated by broad philosophical concerns. Intuitionistic logic reflects a particular
perspective on the nature of judgment and truth. Many-valued logic grew out of
[...]


Thus logic feeds philosophy, and philosophy feeds logic. They join. The result
is philosophical logic. The chapters that follow present the basic formal methods,
models and systems that have been developed in this area. Their emphasis, however,
is on the presentation of the fundamental logical structures as such, as well
as their motivations, rather than their application to specific philosophical questions.
Nevertheless, that these structures do have philosophical application is a common
theme running through all the chapters.

I would distinguish philosophical logic, as it is presented in this volume, from
other, related enterprises that also sometimes go under the name of ‘philosophical
logic’. I would distinguish it from the philosophy of language, with its pursuit of
the concepts of meaning and reference, naming and predication, of the structure
of propositions, questions of analyticity, the nature of speech acts, etc. I would also
distinguish it from the philosophy of logic, with its investigation of the epistemological
and ontological positions of the propositions of logic, of apriority, of
conventionality, of questions about what, if anything, is a logical constant, about the
nature of logical truth and logical consequence (though see chapter 6), and even the
question ‘What is Logic?’ itself. Inevitably, of course, the questions and concerns
of all these areas (philosophical logic, philosophy of language, philosophy of
logic) intertwine, and it would be a mistake to try to establish real boundaries
between them.

Philosophical logic is easily seen as logic for philosophy. It is important, however,
also to recognize that it has other applications in other disciplines as well. Today,
much of the most flourishing research in philosophical logic is being done by
computer scientists, working, for example, on aspects of knowledge representation,
system verification, or AI. Several of the chapters here are written by computer
scientists. The results of philosophical logic are also useful in cognitive science and
theoretical linguistics. This volume ends with chapters linking logical investigation
to inquiry into the structure of language, including natural language. And, we
must not forget, logic is a part of mathematics, and so properly of interest to
mathematicians.








The term ‘logic’, as Wilfrid Hodges observes at the start of chapter 1, is ambiguous.
It can refer to collections of languages that possess particular structures, or to the
study of the rules of sound arguments, which occur in ordinary, natural language. The
latter is how many might think of logic first. In the chapters here, however, the primary
focus will be on logic in the first sense; this provides the entryway to the other.
Logic begins with language, and so the chapters that follow typically include the
specification of the languages of their investigation (or else they will explicitly
presuppose that the language is given, taking for granted that its grammar is well enough
known, perhaps, for example, from earlier chapters). The results determined for such
constructed languages, and their consequence relations, will then extend to natural
languages insofar as our natural languages, or portions of them, share the structures
of the logicians’ constructed languages. Then we come to logic in the second sense,
to the analysis and evaluation of propositions and arguments in the languages we use.

The expressions of a language, especially its well-formed formulas (wffs), can stand
in many relations, and these relations can be described in different ways, which
reflect different ways of looking at logic and logical consequence. Some relations
can be characterized syntactically, in terms of the grammatical structures of the
expressions themselves. Then we often speak of ‘logical syntax’. Of central importance
are relations of derivability or deducibility, of when an expression can be
(syntactically) drawn from others; here one would typically speak of a proof. A
formula, C, is derivable from a set of formulas, Γ, if there is a proof of C from Γ,
where a proof is a structure of formulas of the language that meets specific conditions.
Here “proof”, and hence “derivability”, must be understood as relative to a
particular deductive system or calculus that determines those conditions. Such systems
might be defined axiomatically, or as systems of natural deduction rules (familiar
from many logic courses), or in other ways. These distinctions are somewhat
artificial, though. Axiomatic systems usually have several axioms and very few rules
of inference, while natural deduction systems have many rules of inference and few,
or no, axioms. Typically, axiomatic systems are difficult to work with but easier to
prove things about, while natural deduction systems are easy to work with, but
harder to prove things about. One would like to have both, and, for many of the
sorts of logic presented here, we do.
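
The idea that a proof is a structure of formulas meeting conditions fixed by a particular deductive system can be made concrete with a small sketch in Python. It is an illustration only, not a system taken from any chapter of this volume. The tuple encoding of formulas, the helper names `imp`, `is_proof` and `proves`, the restriction to a single rule of inference (modus ponens), and the treatment of axioms as a finite set of formulas rather than as schemas are all assumptions made for brevity.

```python
from typing import Sequence, Set, Tuple, Union

# Hypothetical encoding: a formula is an atom (a string) or an
# implication represented as the tuple ("->", antecedent, consequent).
Formula = Union[str, Tuple[str, "Formula", "Formula"]]


def imp(a: Formula, b: Formula) -> Formula:
    """Build the implication 'a -> b'."""
    return ("->", a, b)


def is_proof(lines: Sequence[Formula],
             premises: Set[Formula],
             axioms: Set[Formula]) -> bool:
    """Check that every line is a premise, an axiom, or follows from
    two earlier lines by modus ponens (from A and A -> B, infer B)."""
    for i, line in enumerate(lines):
        earlier = lines[:i]
        justified = (
            line in premises
            or line in axioms
            or any(imp(a, line) in earlier for a in earlier)
        )
        if not justified:
            return False
    return True


def proves(lines: Sequence[Formula],
           premises: Set[Formula],
           axioms: Set[Formula],
           conclusion: Formula) -> bool:
    """A sequence proves `conclusion` if it is a proof ending in it."""
    return bool(lines) and lines[-1] == conclusion and is_proof(lines, premises, axioms)


# Usage: from p, p -> q, q -> r there is a proof of r in this toy system.
premises = {"p", imp("p", "q"), imp("q", "r")}
good = ["p", imp("p", "q"), "q", imp("q", "r"), "r"]
bad = ["p", "r"]  # the step to r has no justification
print(proves(good, premises, set(), "r"))
print(proves(bad, premises, set(), "r"))
```

Changing the set of axioms or the rules checked inside `is_proof` changes which sequences count as proofs, which is the sense in which derivability is relative to a calculus.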

Relations between expressions in a language can also, however, be determined
in terms of properties that reach beyond language, through the expressions’ evaluation.
Then we often speak of ‘logical semantics’. To speak of the validity of an
inference in terms of the relation of possible truth-values of the premises and the
conclusion (an argument is valid if it is not possible for all of its premises to be true
and its conclusion false) is to describe validity semantically. Similarly for the
specification of the (semantical) consequence relation of a constructed language. To
define relations among expressions in this way, the logician will usually describe a
model, which is a structure in terms of which formulas are evaluated, and rules for
their evaluation in a model. Then one might say that a formula, C, is a consequence
of others, Γ, or that the inference from Γ to C is semantically valid, just in case every
model that verifies every formula in Γ verifies C. Truth-tables are a familiar elementary
form of logical semantics.
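
For the simplest case, classical propositional logic, the semantic definition of consequence can be checked mechanically by running through a truth-table. The following Python sketch is offered under stated assumptions: the tuple encoding of formulas and the function names `atoms`, `value` and `is_consequence` are invented here for illustration, and a "model" is taken to be nothing more than an assignment of truth-values to sentence letters.

```python
from itertools import product

# Hypothetical encoding: an atom is a string; compound formulas are
# ("not", A), ("and", A, B), ("or", A, B) or ("->", A, B).


def atoms(formula):
    """Collect the sentence letters occurring in a formula."""
    if isinstance(formula, str):
        return {formula}
    return set().union(*(atoms(sub) for sub in formula[1:]))


def value(formula, model):
    """Evaluate a formula in a model (a dict from letters to booleans)."""
    if isinstance(formula, str):
        return model[formula]
    op = formula[0]
    if op == "not":
        return not value(formula[1], model)
    if op == "and":
        return value(formula[1], model) and value(formula[2], model)
    if op == "or":
        return value(formula[1], model) or value(formula[2], model)
    if op == "->":
        return (not value(formula[1], model)) or value(formula[2], model)
    raise ValueError(f"unknown connective: {op!r}")


def is_consequence(premises, conclusion):
    """True just in case every model that verifies every premise
    also verifies the conclusion."""
    letters = sorted(set().union(atoms(conclusion), *(atoms(p) for p in premises)))
    for row in product([True, False], repeat=len(letters)):
        model = dict(zip(letters, row))
        if all(value(p, model) for p in premises) and not value(conclusion, model):
            return False  # a countermodel: premises true, conclusion false
    return True


# Usage: modus ponens is valid; affirming the consequent is not.
print(is_consequence([("->", "p", "q"), "p"], "q"))
print(is_consequence([("->", "p", "q"), "q"], "p"))
```

The search is exhaustive, so it is feasible only for small numbers of sentence letters, and it does not carry over to languages whose models cannot be enumerated in this way.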

The chapters that follow develop both approaches to their logics, the proof-theoretic
and the model-theoretic or semantical, though, in keeping with contemporary
practice, semantics may be emphasized over proof theory. Given these two
approaches, it is natural to ask about the relation between them. Suppose one has a
syntactically defined derivability relation and a semantically defined relation of
validity for a common language; then one relation might include the other, or they
might coincide, i.e., apply to exactly the same inferences. Given a semantical
specification of validity, we say that a syntactically defined logic L is sound or consistent or
correct with respect to that semantics if whenever a formula C is derivable from a set
of formulas Γ in L, C is a semantical consequence of Γ. This means that the rules
of the logic never yield invalid inferences, that they do not prove too much.
(Alternatively, given the prior deductive system L, we might say the semantic theory is
adequate or does not undergenerate with respect to the derivable inferences of L.)
If, conversely, whenever the inference from Γ to C is semantically valid, C can be
derived from Γ in L, then we say that L is complete with respect to that semantics.
This means that L has captured all that is contained in the structures of the models,
that it does not prove too little (or, alternatively, that the semantic theory does not
generate too much).
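
The two properties can be set side by side in symbols. The notation is an assumption of this display rather than something fixed by the text: Γ stands for a set of formulas, C for a formula, the turnstile with subscript L for derivability in the deductive system L, and the double turnstile for semantical consequence with respect to the given semantics.

```latex
\begin{align*}
\text{Soundness of } L:\quad
  & \text{for all } \Gamma \text{ and } C,\quad
    \Gamma \vdash_{L} C \;\Longrightarrow\; \Gamma \models C \\
\text{Completeness of } L:\quad
  & \text{for all } \Gamma \text{ and } C,\quad
    \Gamma \models C \;\Longrightarrow\; \Gamma \vdash_{L} C
\end{align*}
```

When both hold, the two relations coincide, so a question about derivability can be answered by reasoning about models, and conversely.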

Often it is helpful, and desirable, to know that a certain deductive system is sound
and complete with respect to a certain pattern of interpretation. Then results established
for one consequence relation can be readily extended to the other. But more
than that, when one can establish soundness and completeness for a logic, then
one has a sense that one has got things right. The syntactical system is likely to be
transparent; one can see at a glance its commitments. It codifies the properties of the
inference relation. The semantics keeps it from being entirely arbitrary. The relations
defined by the logic are, after all, supposed to do philosophical work. Thus, many of
the chapters here discuss and demonstrate the soundness and completeness, and
related properties, of their systems. There are limits, though. Not all syntactically
defined systems of logic are complete with respect to any appropriate semantics. Not
all semantical interpretations can be axiomatized. Even among philosophically interesting
and important systems, incompleteness can arise, the most famous, and significant,
being the incompleteness of formalized arithmetic, discussed in chapter 4. This
too is important to recognize, for it reveals that there are limits to what formal
(deductive) systems can do. There are also limits to model-theoretic semantics.
What to make of those limits is a paramount question for the philosophy of logic
and mathematics, and philosophy generally.


The chapters of this volume are arranged into four rough groups. Chapters 1-6
present classical logic, or logic from a classical perspective. “Classical logic” here does
not mean the logic of antiquity; it is not Aristotle’s logic. Historically, the roots of
classical logic are seen in the nineteenth century, in the work of Boole and De Morgan,
and, on another front, Peirce, among others. It reached its mature growth, however,
following the groundbreaking work of Frege and of Whitehead and Russell. This, or
at least its elementary part, is the sort of logic one is most likely to learn in a basic
logic course. I will not try to define precisely what classical logic is, but leave that to
the first two chapters. Briefly, though, classical logic is logic that makes the simplest,
and hence the strongest, assumptions about language, truth and consequence. It is
logic in a narrowly circumscribed language that is two-valued, in the sense that every
sentence in that language is presumed to be either true or false, but not both, and
that is furthermore extensional, in the sense that expressions can be replaced by
others with the same denotation or truth-value in any context and the result will
have the same denotation or truth-value as the original. In addition, logical consequence
is usually assumed to mean formal truth preservation; an argument is valid
just in case it has a valid form, and a form is valid just in case it has no instance or
interpretation that would make all the premises true and the conclusion false. (These
are rough characterizations; if they prompt questions, the reader should welcome
the chapters that follow.)

Chapters 1 and 2 present classical logic itself, first First-Order Classical Logic and
then Higher-Order Classical Logic. Set Theory, the topic of chapter 3, could be
regarded as a part of mathematics more than logic. Whether or not that is so, and
whether or not it matters, the concepts and results of set theory are so central to
contemporary logical inquiry that it seems valuable to include this chapter here. Also,
set theory can, in a sense, be seen as a complement, or an alternative, to higher-order
logic, and so chapter 3 is a natural successor to chapter 2.

Gödel’s Incompleteness Theorems are among the premier results concerning classical
logic, including both higher-order logic and set theory. They establish the ineluctable
separation of the concepts of truth and provability within formal systems for
languages of a certain power. These results are widely known; how to prove them
less so. Chapter 4 leads the reader through a number of different ways to establish
these and other closely related results in a very transparent way, so that one can
easily appreciate the concepts employed in the proofs, and understand the significance
of the theorems themselves.

At the heart of philosophical and logical inquiry is the concept of Truth. Most of
the chapters here use this concept freely; in chapter 5, it is the object of study itself.
Here it is paramount to cope with the famous ‘Liar’ paradox and related problems.
This chapter presents methods for defining truth that have been developed in recent
years as alternatives to Tarski’s own, now classic, account of truth that seems so
commonplace to many.

Logic, we say, is the theory of the consequence relation. What this might mean,
and what the results of formal systems of logic indicate about this relation, is the
topic of chapter 6 on Logical Consequence. This chapter directs attention to the
connection between logic in the first sense mentioned above and logic in the second
sense, how deductive and model-theoretic consequence for a formal system S may
(or may not) be reliable indicators of our pretheoretical relation of logical consequence.
This chapter is a natural transition between the first group of chapters
and the groups to follow.

The chapters of the second group, chapters 7-10, present extensions of classical
logic, where the expressive power of the classical language is augmented by the
addition of new, non-extensional operators. Although there are other ways to extend
classical logic, these chapters all present types of modal logic. They add
non-truth-functional operators to express modalities of necessity and possibility (Modal
Logic, chapter 7), obligation, permission and prohibition (Deontic Logic, chapter 8),
knowledge and belief (Epistemic Logic, chapter 9), and temporality (Temporal Logic,
chapter 10). These are clearly motivated by philosophical purposes and applications,
and they also have important applications that go beyond philosophy, e.g., in
computer science.

The chapters of the third group, chapters 11-16, describe a number of different
alternatives to classical logic. Each one challenges one or more of the classical
assumptions about truth or consequence, such as the law of bivalence (that every
sentence of the language is either true or false), or that validity can be defined solely
in terms of relations of truth-value, and develops the logic that results from an
alternative or more general point of view. (Sometimes classical logic can be seen as
a special case of the alternative.) Thus, Intuitionistic Logic, chapter 11, works from
quite a different understanding of truth and the meanings of the logical connectives
and operations; here the idea of constructions, including proofs, is of central importance.
Free Logics, chapter 12, are free of the assumption that singular (also general)
terms must have existential import, while they still regard quantifiers in the ordinary
way, as ranging only over what exists. Relevant Logics (often called relevance logics),
chapter 13, require that for an inference to be valid there must be a connection of
meaning, or use, i.e., some relevance, between premises and conclusion; among
other things, these logics are particularly designed to avoid the various paradoxes of
implication. Many-Valued Logics, chapter 14, reject the classical law of bivalence
and, as the name suggests, allow that there be more than the two truth-values, true
and false; there are many ways this can be done, and these logics have a wide range
of applications. Nonmonotonic Logic, chapter 15, as its name too suggests, is concerned
with consequence relations that are not monotonic, i.e., that allow for inferences
where a conclusion follows validly from a set of premises, but when additional
premises are added to the argument, the resulting inference is no longer valid. Such
logics are particularly important for the fields of AI and cognitive science. In Probability
Logic, chapter 16, statements are evaluated not only in terms of truth-value,
but also probability, and validity is defined not just in terms of the preservation of
truth, but also the transmission of probabilities. This too proves important for
analyzing ordinary reasoning.
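
The failure of monotonicity described above can be seen in a deliberately tiny example. The Python sketch below is a toy and should not be mistaken for any of the systems presented in chapter 15. It assumes a made-up notion of "default rule" (a prerequisite, a conclusion, and a set of exceptions), and the names `follows` and `DEFAULTS`, like the bird and penguin vocabulary, are purely illustrative.

```python
# Hypothetical default rule: (prerequisite, conclusion, exceptions).
# Read: "if the prerequisite holds, conclude the conclusion,
# unless one of the exceptions is among the premises".
DEFAULTS = [
    ("bird", "flies", {"penguin"}),
]


def follows(premises, conclusion, defaults=DEFAULTS):
    """Toy consequence relation: a conclusion follows if it is a premise,
    or if some default yields it and none of that default's exceptions
    is among the premises."""
    premises = set(premises)
    if conclusion in premises:
        return True
    return any(
        prerequisite in premises
        and target == conclusion
        and not (set(exceptions) & premises)
        for prerequisite, target, exceptions in defaults
    )


# Usage: the conclusion is drawn from the smaller premise set,
# and withdrawn when a further premise is added.
print(follows({"bird"}, "flies"))
print(follows({"bird", "penguin"}, "flies"))
```

In a monotonic consequence relation, such as the classical one, the second call could not come out differently from the first, since every premise of the first argument is still a premise of the second.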

The fourth group of chapters examines more closely some of the concepts that
seem particularly fundamental to logic and all of the sorts of logic presented in the
earlier chapters. Chapter 17 addresses Conditionals, especially the indicative ‘if ... then’
as it occurs in natural languages. Many regard the conditional as the central
connective of logical inquiry. Chapter 18 examines Negation, a concept that has
vexed philosophers since the most ancient times, and looks at a variety of forms it
can take. Chapter 19 reveals how, even at the level of first-order languages,
Quantifiers can be very much richer than one might suppose from a typical first course in
logic. These three concepts are central not only to formal logic, but also to the logic
of our natural languages, and so, in these chapters, we see increasing attention to
the connection between the two. The relation between Logic, that is, formal logic,
and Natural Language is the subject of our final chapter, chapter 20, which focuses
especially on varieties of formal semantics for natural languages; this is a thriving
area of research, especially in theoretical linguistics today.

Although the divisions between these four groups are convenient, they are also
somewhat artificial, and so I have not divided the volume into discrete “Parts.” For
example, modal logic, here presented as an extension of classical logic, was originally
introduced by C. I. Lewis as a theory of “strict implication” that was meant to be an
alternative to classical ‘material implication’ with the aim of avoiding what were seen
as paradoxical consequences if the classical connection were truly taken to be implication.
Similarly, although relevant logic is described here as an alternative to classical
logic, based on concerns similar to Lewis’s, in an important sense it also contains
classical logic, and so might be considered an extension of that logic. Gödel’s
Incompleteness Theorems are not limited to strictly classical languages. Likewise,
the issues of truth and of logical consequence discussed in chapters 5 and 6 span the
full range of logical perspectives.


The 20 chapters of this volume present a central core of philosophical logic today.
But for the limits of a volume like this, however, there could have been many more
subjects treated. There could easily have been separate chapters on Proof Theory, on
Combinatory Logic, the λ-calculus, and Type Theory, or on Algebraic Logic, which
are touched on in various chapters here. There could have been a chapter on
Correspondence Theory, which concerns relations between modal logics and first-order
logic. Relevant Logic (chapter 13) is a kind of Paraconsistent Logic, and it is
a kind of Substructural Logic. Both of those could have provided separate chapters
of their own. Dynamic Logic is a sort of modal logic that can be important for the
analysis of action; it also has particular application within computer science. There
could have been chapters on the Logic of Questions, and of Imperatives. There
could have been chapters on other logical connectives and operators, like conjunction
and disjunction. There could have been a chapter on alternative interpretations
of quantifiers, such as substitutional or truth-functional interpretations that contrast
with the domain-of-discourse interpretation primarily employed here. There are
further topics on the relation between the formal languages of logic and natural
language. There could have been chapters on names, and predicates, on definite
descriptions, on indexicals. And so it goes. This, however, is an introductory volume,
and from this introduction, the reader should have the grounding to pursue those
other topics, and more.

Furthermore, the 20 chapters that are here are only introductions to their subjects.
Each could easily say far, far more about those subjects, and it is only due to
the limitations of a volume like this that they do not. There are more variations
within these logics, more properties, more results and more methods of proof that
ultimately are essential for fully understanding the subjects than could be discussed
here. Since each chapter is only an introduction, each one includes a list of Suggested
Further Reading to guide the reader to additional sources. The forthcoming
Blackwell Companion to Philosophical Logic, edited by Dale Jacquette, will also offer
further discussion of these topics, and many others, and so provides a natural companion
to this volume. Another very valuable resource is the Handbook of Philosophical
Logic, edited by Dov Gabbay and Franz Guenthner. The first edition (D. Reidel,
Dordrecht; 1984-89) is in four volumes, with monograph-length articles on many
of the same subjects as the chapters here, as well as many others. The second
edition, edited by Gabbay and now in production, is projected to comprise over 14
volumes! The present Guide might be considered a stepping stone to that monumental
work. Beyond these sources, one should look to the books, monographs and
journal articles, of the authors here, and of countless others. That is where the great
ongoing work of philosophical logic occurs.


A glance at the chapters here will reveal a wide array of different logical systems.
What is one to make of such profusion? Logicians all profess to be interested in the
same thing, logical consequence. Why then are there so many logics? Questions like
these come to the fore when one considers the relations between the two senses of
‘logic’ mentioned above. In the first sense, there are so many logics because there
are so many languages that logicians devise. But when one is interested in logic in
the second sense, in logic as a theory of valid reasoning in a sense that will do
philosophical work, then one might naturally wonder which, if any, of all these
logics, in the first sense, is correct and which is best to rely on.

To some extent, of course, the different kinds of logics complement each other.
For example, one could well combine alethic modal logic with deontic logic or
epistemic logic, etc., and ultimately one would want to, in order to be able to
express and analyze arguments and claims of mixed modes (e.g., that ‘ought’ implies
‘can’). And one can develop relevant or intuitionistic modal logics, to bring those
perspectives on the consequence relation to the study of the modalities, just as there
are also relevant and intuitionistic set theories and higher-order logics. Yet, within
each specialized category, one is still confronted with a wide variety of different
systems. There are infinitely many modal logics. Many, indeed most, might have
merely mathematical interest, if even that, but still there will be several that have
serious claim to philosophical significance. Are they all correct, or should one select
just one as the real logic of necessity? (Some years ago, a distinguished logician is
reported to have remarked that, although there are many modal logics, S4 is the
true one. One rarely hears such sentiments today.) Or consider the law of the
excluded middle, p or not-p. Classical logic and relevant logic, for example, are
committed to this principle, while intuitionistic logic and many-valued logics deny
it. Is the law true or not? How one answers would seem to force one toward one
sort of logic and away from another. (With wit, though, one might answer ‘neither’;
with more wit one might answer ‘both’. Where does one go then?) In a similar vein,
classical logic and intuitionistic logic both embrace the validity of the inference
scheme ex falso quodlibet, that a contradiction implies anything (p and not-p, so q),
which relevant logic rejects. Again, it would seem that one cannot have it both ways,
and thus that, at least with respect to this, one sort of logic gets it right, while
another does not.
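
How a change of semantics can change the verdict on these two principles can be illustrated with three truth-values. The Python sketch below is an illustration under assumptions, not an account of the systems treated in chapters 13 and 14. It uses the strong Kleene tables (negation as 1 minus the value, disjunction as maximum, conjunction as minimum) with a third value 0.5, and it compares two choices of designated values: only 1 (the ‘neither’ reading of the middle value, as in Kleene's three-valued logic) and both 0.5 and 1 (the ‘both’ reading, as in Priest's Logic of Paradox). The function names are invented here.

```python
VALUES = (0, 0.5, 1)  # false, middle value, true


def neg(a):
    return 1 - a


def disj(a, b):
    return max(a, b)


def conj(a, b):
    return min(a, b)


def excluded_middle_holds(designated, values=VALUES):
    """Is 'p or not-p' designated under every assignment to p?"""
    return all(disj(p, neg(p)) in designated for p in values)


def ex_falso_holds(designated, values=VALUES):
    """Is 'p and not-p, so q' valid, i.e. is q designated whenever
    the contradiction 'p and not-p' is designated?"""
    return all(
        q in designated
        for p in values
        for q in values
        if conj(p, neg(p)) in designated
    )


NEITHER = {1}       # middle value read as 'neither true nor false'
BOTH = {0.5, 1}     # middle value read as 'both true and false'

# Usage: each reading keeps one principle and gives up the other.
for name, designated in (("neither", NEITHER), ("both", BOTH)):
    print(name, excluded_middle_holds(designated), ex_falso_holds(designated))

# With only the two classical values, both principles hold.
print(excluded_middle_holds({1}, values=(0, 1)), ex_falso_holds({1}, values=(0, 1)))
```

On the first reading the excluded middle fails while ex falso quodlibet survives; on the second the reverse happens. Neither choice by itself settles which, if either, is correct.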

Considerations like these might naturally lead one to think there must, after all,
be one true logic that accurately captures the consequence relation, and that the
various systems are competitors to that claim. But perhaps it is not so simple. The
questions themselves might be complex, and require complex and multiple answers.
For example, consider modal logic again and the number of interesting systems
there. One might maintain that each captures a different sense of necessity, and
hence they are not really rivals at all. Similarly, with respect to the law of the
excluded middle, one might say that the classical logician and the intuitionist are
simply working with different concepts of disjunction and negation, and so, again,
there is no real disagreement. Perhaps all the logics can be combined into a great
potpourri and all are correct (like the complementation strategy above). On the
other hand, while such an ecumenical spirit could seem congenial to some, I suspect
it distorts the way the different logics regard their own work, and what they each
have to say. There are other responses, however, that might allow one to accept
multiple sorts of logic as equally correct, complements more than rivals, yet still not
susceptible to being combined. For it might be that, while there are not multiple
concepts of necessity, for example, or more importantly, of logical consequence
itself, as they are ordinarily understood, nevertheless, these pretheoretical concepts
are protean enough not to be captured by a single formal system. It might be that,
in certain settings, certain criteria of necessity or validity are paramount while, in
others, others are. Instead of asking which is the one true correct logic, perhaps one
should ask first, True to what?, Correct for what?

These are substantial questions that deserve philosophical attention. The place to
begin is with a thorough understanding of the structures of the logics themselves,
and their motivations. That is the purpose of the chapters of this volume.











Chapter 1 


Classical Logic I: 
First-Order Logic 


Wilfrid Hodges 


1.1. First-Order Languages 


The word ‘logic’ in the title of this chapter is ambiguous. 

In its first meaning, a logic is a collection of closely related artificial languages. 
There are certain languages called first-order languages, and together they form 
first-order logic. In the same spirit, there are several closely related languages called 
modal languages, and together they form modal logic. Likewise second-order logic, 
deontic logic and so forth. 

In its second but older meaning, logic is the study of the rules of sound argument. 
First-order languages can be used as a framework for studying rules of argument; 
logic done this way is called first-order logic. The contents of many undergraduate 
logic courses are first-order logic in this second sense. 

This chapter will be about first-order logic in the first sense: a certain collection of 
artificial languages. In Hodges (1983), I gave a description of first-order languages 
that covers the ground of this chapter in more detail. That other chapter was meant 
to serve as an introduction to first-order logic, and so I started from arguments in 
English, gradually introducing the various features of first-order logic. This may be 
the best way in for beginners, but I doubt if it is the best approach for people 
seriously interested in the philosophy of first-order logic; by going gradually, one 
blurs the hard lines and softens the contrasts. So, in this chapter, I take the opposite 
route and go straight to the first-order sentences. Later chapters have more to say 
about the links with plain English. 

The chief pioneers in the creation of first-order languages were Boole, Frege and 
C. S. Peirce in the nineteenth century; but the languages became public knowledge 
only quite recently, with the textbook of Hilbert and Ackermann (1950), first 
published in 1928 but based on lectures of Hilbert in 1917-22. (So first-order logic 
has been around for about 70 years, but Aristotle's syllogisms for well over 2000 
years. Will first-order logic survive so long?) 

From their beginnings, first-order languages have been used for the study of 
deductive arguments, but not only for this: both Hilbert and Russell used first-order 
formulas as an aid to definition and conceptual analysis. Today, computer 
science has still more uses for first-order languages, e.g., in knowledge representation 
and in specifying the behavior of systems. 

You might expect at this point to start learning what various sentences in first-order 
languages mean. However, first-order sentences were never intended to mean 
anything; rather they were designed to express conditions which things can satisfy 
or fail to satisfy. They do this in two steps. 

First, each first-order language has a number of symbols called nonlogical constants; 
older writers called them primitives. For brevity, I shall call them simply 
constants. To use a first-order sentence φ, something in the world (a person, a 
number, a colour, whatever) is attached (or in the usual jargon, assigned) to each 
of the constants of φ. There are some restrictions on what kind of thing can be 
assigned to what constant; more on that later. The notional glue that does the 
attaching is called an interpretation or a structure or a valuation. These three words 
have precise technical uses, but for the moment ‘interpretation’ is used as the least 
technical term. 

Second, given a first-order sentence φ and an interpretation I of φ, the semantics of 
the first-order language determine either that I makes φ true, or that I makes φ false. 
If I makes φ true, this is expressed by saying that I satisfies φ, or that I is a model of 
φ, or that φ is true in I or under I. (The most natural English usage seems to be 
‘true in a structure’ but ‘true under an interpretation.’ Nothing of any importance 
hangs on the difference between ‘under’ and ‘in,’ and I will not be entirely consistent 
with them.) The truth-value of a sentence under an interpretation is Truth if 
the interpretation makes it true, and Falsehood if the interpretation makes it false. 

The main difference between one first-order language and any other lies in its set 
of constants; this set is called the signature of the language. (First-order languages 
can also differ in other points of notation, but this shall be ignored here.) If σ is 
a signature of some first-order language, then an interpretation is said to be of 
signature σ if it attaches objects to exactly those constants that are in σ. So an 
interpretation of signature σ contains exactly the kinds of assignment that are needed 
to make a sentence of signature σ true or false. 

Examples of first-order languages must wait until some general notions are introduced 
in the next section, but as a foretaste, many first-order languages have a 
sentence that is a single symbol 

⊥ 

pronounced ‘absurdity’ or ‘bottom.’ Nobody knows or cares what this symbol 
means, but the semantic rules decree that it is false. So, it has no models. It is not a 
nonlogical constant; its truth-value does not depend on any assignment. 


1.2. Some Fundamental Notions 


In the definitions below, it is assumed that some fixed signature σ has been 
chosen; the sentences are those of the first-order language of signature σ and the 
interpretations are of signature σ. So each interpretation makes each sentence either 
true or false: 


                    true? false? 
interpretations I      ←→      sentences φ 


This picture can be looked at from either end. Starting from an interpretation I, it 
can be used as a knife to cut the class of sentences into two groups: the sentences 
which it satisfies and the sentences which it does not satisfy. The sentences satisfied 
by I are together known as the (first-order) theory of I. More generally, any set of 
sentences is called a theory, and I is a model of a theory T if it is a model of every 
sentence in T. By a standard mathematical convention, every interpretation is a 
model of the empty theory, because the empty theory contains no sentence that is 
false in the interpretation. 

Alternatively, the picture can be read from right to left, starting with a sentence 
φ. The sentence separates the class of interpretations into two collections: those 
which satisfy it and those which do not. Those which satisfy φ are together known 
as the model class of φ. In fact, a similar definition can be given for any theory T: the 
model class of T is the class of all interpretations that satisfy T. If a particular class 
K of interpretations is the model class of a theory T, then T is a set of axioms for K. 
This notion is important in mathematical applications of first-order logic, because 
many natural classes of structures (e.g., the class of groups) are the model classes 
of first-order axioms. 

Two theories are said to be logically equivalent, or more briefly equivalent, if they 
have the same model class. As a special case, two sentences are said to be equivalent 
if they are true in exactly the same interpretations. A theory is said to be (semantically) 
consistent if it has at least one model; otherwise, it is (semantically) inconsistent. 
There are many semantically inconsistent theories, for example the one consisting 
of the single sentence ‘⊥’. The word ‘semantically’ is a warning of another kind of 
inconsistency, discussed at the end of section 1.8. 

Suppose T is a theory and ψ is a sentence. Then T entails ψ if there is no 
interpretation that satisfies T but not ψ. Likewise, ψ is valid if every interpretation 
makes ψ true. One can think of validity as a special case of entailment: a sentence is 
valid if and only if it is entailed by the empty theory. 

The symbol ‘⊢’ is pronounced ‘turnstile.’ A sequent is an expression 

T ⊢ ψ 

where T on the left is a theory and ψ on the right is a sentence. The sentences in 
T are called the premises of the sequent and ψ is called its conclusion. The sequent is 
valid if T entails ψ, and invalid otherwise. If T is a finite set of sentences, the 
sequent ‘T ⊢ ψ’ can be written as a finite sequent 

φ₁, ..., φₙ ⊢ ψ 

listing the contents of T on the left. The language under discussion (i.e. the 
first-order language of signature σ) is said to be decidable if there is an algorithm (i.e. a 
mechanical method which always works) for telling whether any given finite sequent 
is valid. 

A proof calculus C consists of 

(i) a set of rules for producing patterns of symbols called formal proofs or 
derivations, and 

(ii) a rule which determines, given a formal proof and a sequent, whether the 
formal proof is a proof of the sequent. 


Here ‘proof of’ is just a set of words; but one of the purposes of proof calculi is that 
they should give ‘proofs of’ all and only the valid sequents. The following definitions 
make this more precise: 

1  A sequent 

T ⊢ ψ 

is derivable in C, or in symbols 

T ⊢_C ψ 

if some formal proof in the calculus C is a proof of the sequent. 

2  A proof calculus C is correct (or sound) if no invalid sequent is derivable in C. 

3  C is complete if every valid sequent is derivable in C. 

So a correct and complete proof calculus is one for which the derivable sequents are 
exactly the valid ones. One of the best features of first-order logic, from almost 
anybody's point of view, is that it has several excellent correct and complete proof 
calculi. Some are mentioned in section 1.8. 





1.3. Grammar and Semantics 


As in any language, the sentences of first-order languages have a grammatical structure. 
The details vary from language to language, but one feature that all first-order 
languages share is that the grammatical structure of any given sentence is uniquely 
determined. There are no grammatically ambiguous sentences like Chomsky's 

They are flying planes. 

This property of first-order languages is called the unique parsing property. 

To guarantee unique parsing, first-order formulas generally have a large number 
of brackets. There are conventions for leaving out some of these brackets without 
introducing any ambiguity in the parsing. For example, if the first and last symbols 
of a sentence are brackets, they can be omitted. Any elementary textbook gives 
further details. 

For historical reasons, there is a hitch in the terminology. With a first-order 
language, the objects that a linguist would call ‘sentences’ are called formulas (or in 
some older writers well-formed formulas or wff), and the word ‘sentence’ is reserved 
for a particular kind of formula, as follows. 

Every first-order language has an infinite collection of symbols called variables: 

x₀, x₁, x₂, ... 

To avoid writing subscripts all the time, it is often assumed that 

x, y, z, u, v 

and a few similar symbols are variables too. Variables are not in the signature. From 
a semantic point of view, variables can occur in two ways: when a variable at some 
point in a formula needs to have an object assigned to it to give the formula a 
truth-value, this occurrence of the variable is called free; when no such assignment is 
needed, it is called bound. A sentence is a formula with no free occurrences of 
variables. To avoid confusing variables with constants, an assignment of objects to 
the variables is called a valuation. So, in general, a first-order formula needs an 
interpretation I of its constants and a valuation v of its variables to have a 
truth-value. (It will always be clear whether ‘v’ means a variable or a valuation.) 

The definitions of the previous section all make sense if ‘sentence’ is read as 
‘first-order sentence’; they also make sense if ‘sentence’ is read as ‘first-order formula’ and 
‘interpretation’ as ‘interpretation plus valuation’. Fortunately, the two readings do 
not clash; for example, a sequent of first-order sentences is valid or invalid, regardless 
of whether the first-order sentences are regarded as sentences or as formulas. 
That needs a proof, one that can be left to the mathematicians. Likewise, according 
to the mathematicians, a first-order language is decidable in terms of sentences if 
and only if it is decidable in terms of formulas. (Be warned though that ‘first-order 
theory’ normally means ‘set of first-order sentences’ in the narrow sense. To refer to 
a set of first-order formulas it is safest to say ‘set of first-order formulas.’) 

The next few sections present the semantic rules in what is commonly known as 
the Tarski style (in view of Tarski (1983) and Tarski and Vaught (1957)). In this 
style, to find out what interpretations-plus-valuations make a complex formula φ 
true, the question is reduced to the same question for certain formulas that are 
simpler than φ. The Tarski style is not the only way to present the semantics. A 
suggestive alternative is the Henkin-Hintikka description in terms of games; see 
Hintikka (1973, ch. V), or its computer implementation by Barwise and Etchemendy 
(1999). Although the Tarski-style semantics and the game semantics look very 
different, they always give the same answer to the question: ‘What is the truth-value 
of the sentence φ under the interpretation I?’ 

Throughout this chapter, symbols such as ‘φ’, ‘α’ are used to range over the 
formulas or terms of a first-order language. They are metavariables; in other words, 
they are not in the first-order languages but are used for talking about these 
languages. On the other hand, so as not to saddle the reader with still more metavariables 
for the other expressions of a first-order language, for example, when making a 
general statement about variables, a typical variable may be used as if it was a 
metavariable. Thus ‘Consider a variable x’ is common practice. More generally, quotation 
marks are dropped when they are more of a nuisance than a help. 


1.4. The First-Order Language with Empty Signature 


The simplest first-order language is the one whose signature is empty. In this 
section, this language is referred to as L. 

An interpretation I with empty signature does not interpret any constants, but 
(for reasons that will appear very soon) it does have an associated class of objects, 
called its universe or domain. (The name ‘domain’ is perhaps more usual; but it 
has other meanings in logic so, to avoid confusion, ‘universe’ is used instead.) Most 
logicians require that the universe shall have at least one object in it; but, apart 
from this, it can be any class of objects. The members of the domain are called 
the elements of the interpretation; some older writers call them the individuals. A 
valuation in I is a rule v for assigning to each variable xᵢ an element v(xᵢ) in the 
universe of I. 

For the grammatical constructions given here, an interpretation I and a valuation 
v in I are assumed. The truth-values of formulas will depend partly on I and partly 
on v. A formula is said to be true in I under v. 

Some expressions of L are called atomic formulas. There are two kinds: 

• Every expression of the form ‘(x = y)’, where x and y are variables, is an atomic 
formula of L. 

• ‘⊥’ is an atomic formula of L. 


It has already been noted that ‘⊥’ is false in I. The truth-value of (x₁ = x₂), to take 
a typical example of the other sort of atomic formula, is 

Truth if v(x₁) is the same element as v(x₂), 

Falsehood if not. 

So, given I and v, truth-values are assigned to the atomic formulas. 

Next, a class of expressions called the formulas of L is defined. The atomic 
formulas are formulas, but many formulas of L are not atomic. Take the five symbols 

~   &   ∨   ⊃   ≡ 

which are used to build up complex formulas from simpler ones. There is a grammatical 
rule for all of them. 

If φ and ψ are formulas, then so are each of these expressions: 

(~φ)   (φ & ψ)   (φ ∨ ψ)   (φ ⊃ ψ)   (φ ≡ ψ) 

This chart, called a truth-table, shows which of these formulas are true, depending 
on whether φ and ψ are true: 

φ  ψ  |  (~φ)  (φ & ψ)  (φ ∨ ψ)  (φ ⊃ ψ)  (φ ≡ ψ) 
T  T  |   F       T        T        T        T 
T  F  |   F       F        T        F        F 
F  T  |   T       F        T        T        F 
F  F  |   T       F        F        T        T 

(Here T = Truth and F = Falsehood.) Because of this table, the five symbols ‘~’, 
‘&’, ‘∨’, ‘⊃’, and ‘≡’ are known as the truth-functional symbols. 

For example, the formula 

(((x₁ = x₂) & (x₂ = x₃)) ⊃ (x₁ = x₃)) 

is false in just one case, namely where v(x₁) and v(x₂) are the same element, and 
v(x₂) and v(x₃) are the same element, but v(x₁) and v(x₃) are not the same element. 
Since this case can never arise, the formula is true regardless of what I and v are. 
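
The truth-table and the example just given can be checked mechanically. What follows is only a sketch in Python, not part of the chapter's apparatus: the function names (neg, conj, disj, implies, iff, truth_table, transitivity_holds) are invented for illustration, Python's True and False stand in for Truth and Falsehood, and only finite universes can be tested this way.

```python
from itertools import product

# The five truth-functional symbols as functions on booleans
# (True stands for Truth, False for Falsehood).
def neg(p):
    return not p

def conj(p, q):
    return p and q

def disj(p, q):
    return p or q

def implies(p, q):
    return (not p) or q

def iff(p, q):
    return p == q

def truth_table():
    """Rows (phi, psi, ~phi, phi & psi, phi v psi, phi > psi, phi = psi)."""
    return [(p, q, neg(p), conj(p, q), disj(p, q), implies(p, q), iff(p, q))
            for p, q in product([True, False], repeat=2)]

def transitivity_holds(universe):
    """Check (((x1 = x2) & (x2 = x3)) > (x1 = x3)) under every valuation
    of x1, x2, x3 in the given finite universe."""
    return all(implies(conj(a == b, b == c), a == c)
               for a, b, c in product(universe, repeat=3))

if __name__ == "__main__":
    for row in truth_table():
        print(" ".join("T" if value else "F" for value in row))
    print(transitivity_holds(["a", "b", "c"]))
```

Running over every triple of elements is a brute-force stand-in for ‘regardless of what I and v are’; it illustrates the claim for a given finite universe but is of course no substitute for the general argument.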

There remain just two grammatical constructions. The grammatical rule for them 
both is: 

If φ is any formula and x is any variable, then the expressions 

(∀x)φ    (∃x)φ 

are both formulas. 


The expressions ‘(∀x)’ and ‘(∃x)’ are called respectively a universal quantifier and 
an existential quantifier, and read respectively as ‘for all x’ and ‘there is x’. In the 
two formulas given by the rule, the occurrence of x inside the quantifier is said to 
bind itself and any occurrences of the same variable in the formula φ. These occurrences 
stay bound as still more complex formulas are built. In any formula, an 
occurrence of a variable which is not bound by a quantifier in the formula is said 
to be free. A formula with no free variables is called a sentence. (And this is what 
was meant earlier by ‘sentences’ of first-order languages. The syntactic definitions 
just given are equivalent to the semantic explanation in the previous section.) For 
example, this is not a sentence 


(∃y)~(x = y) 


because it has a free occurrence of x (though both occurrences of y are bound by the 
existential quantifier). But this is a sentence: 

(∀x)(∃y)~(x = y) 

because its universal quantifier binds the variable x. 

The semantic rules for the quantifiers are one of the harder concepts of first-order 
logic. For more than two millennia, some of the best minds of Europe struggled to 
formulate semantic rules that capture the essence of the natural language expressions 
‘all’ and ‘there is.’ 

• ‘(∀x)φ’ is true in I under v iff: for every element a in the universe of I, if w is 
taken to be the valuation exactly like v except that w(x) is a, then φ is true in I 
under w. 

• ‘(∃x)φ’ is true in I under v iff: there is an element a in the universe of I, such that 
if w is taken to be the valuation exactly like v except that w(x) is a, then φ is true 
in I under w. 


For example, the formula 

(∃y)~(x = y) 

is true in I under v iff (if and only if) there is an element a such that v(x) is not the 
same element as a. So the sentence 

(∀x)(∃y)~(x = y) 

is true in I under v iff for every element b there is an element a such that b is not the 
same element as a. In other words, it is true iff the universe of I contains at least two 
different elements. 

Note that this last condition depends only on I and not at all on v. One can prove 
that the truth-value of a formula φ in I and v never depends on v(x) for any variable 
x that does not occur free in φ. Since sentences have no free variables, their 
truth-value depends only on I and the valuation slips silently away. 
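
The Tarski-style rules for the language with empty signature translate almost line for line into a recursive procedure. The sketch below is an illustration in Python under stated assumptions: formulas are encoded as nested tuples (an encoding invented here, with tags such as "eq", "not", "all", "ex"), the universe must be finite so that the quantifier clauses can be run, and the names holds, free_vars and AT_LEAST_TWO are hypothetical.

```python
def holds(formula, universe, v):
    """Truth-value of `formula` in the interpretation with the given finite
    universe, under the valuation `v` (a dict from variables to elements)."""
    op = formula[0]
    if op == "bot":                       # absurdity is always false
        return False
    if op == "eq":                        # ("eq", "x", "y") stands for (x = y)
        return v[formula[1]] == v[formula[2]]
    if op == "not":
        return not holds(formula[1], universe, v)
    if op == "and":
        return holds(formula[1], universe, v) and holds(formula[2], universe, v)
    if op == "or":
        return holds(formula[1], universe, v) or holds(formula[2], universe, v)
    if op == "imp":
        return (not holds(formula[1], universe, v)) or holds(formula[2], universe, v)
    if op == "iff":
        return holds(formula[1], universe, v) == holds(formula[2], universe, v)
    if op in ("all", "ex"):               # ("all", "x", body) stands for (Ax)body
        x, body = formula[1], formula[2]
        # w = {**v, x: a} is exactly like v except that w(x) is a
        results = (holds(body, universe, {**v, x: a}) for a in universe)
        return all(results) if op == "all" else any(results)
    raise ValueError(f"unknown construction: {op!r}")

def free_vars(formula):
    """The set of variables with at least one free occurrence."""
    op = formula[0]
    if op == "bot":
        return set()
    if op == "eq":
        return {formula[1], formula[2]}
    if op in ("all", "ex"):
        return free_vars(formula[2]) - {formula[1]}
    return set().union(*(free_vars(sub) for sub in formula[1:]))

# (Ax)(Ey)~(x = y): true iff the universe has at least two elements
AT_LEAST_TWO = ("all", "x", ("ex", "y", ("not", ("eq", "x", "y"))))

if __name__ == "__main__":
    print(holds(AT_LEAST_TWO, [0], {}))        # one-element universe
    print(holds(AT_LEAST_TWO, [0, 1], {}))     # two-element universe
    print(free_vars(AT_LEAST_TWO))             # a sentence: no free variables
```

Because AT_LEAST_TWO has no free variables, the empty valuation {} suffices, and supplying any other valuation gives the same answer; this is the sense in which the valuation ‘slips silently away’.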

These rules capture the essence of the expressions ‘all’ and ‘there is’ by stating 
precise conditions under which a sentence starting with one of these phrases counts 
as true. The same applies to the truth-functional symbols, which are meant, in some 
sense, to capture at least the mathematical use of the words ‘not’, ‘and’, ‘or’, ‘if ... 
then’, and ‘if and only if’. 





1.5. Some Notation 


The notation in this section applies to all first-order languages, not just the language 
with empty signature. 

Writing a formula as φ(x₁, ..., xₙ), where x₁, ..., xₙ are different variables, means 
that the formula is φ and it has no occurrences of free variables except perhaps for 
x₁, ..., xₙ. Then 

I ⊨ φ[a₁, ..., aₙ] 

means that φ is true in the interpretation I and under some valuation v for which 
v(x₁), ..., v(xₙ) are a₁, ..., aₙ respectively (or under any such valuation v; it makes 
no difference). When φ is a sentence, the a₁, ..., aₙ are redundant and 

I ⊨ φ 

simply means that φ is true in I. 
Here is another useful piece of notation: 

φ(y/x) 

means the formula obtained by replacing each free occurrence of x in φ by an 
occurrence of y. Actually, this is not quite right, but the correction in the next 
paragraph is rather technical. What is intended is that the formula φ(y/x) says about 
y the same thing that φ said about x. 

Suppose, for example, that φ is the formula 

(∀y)(x = y) 

which expresses that x is identical with everything. Simply putting y in place of each 
free occurrence of x in φ gives 

(∀y)(y = y). 

This says that each thing is identical to itself; whereas the intention was to make the 
more interesting statement that y is identical with everything. The problem is that y 
is put into a place where it immediately becomes bound by the quantifier (∀y). So 
φ(y/x) must be defined more carefully, as follows. First, choose another variable, say 
z, that does not occur in φ, and adjust φ by replacing all bound occurrences of y in 
φ by bound occurrences of z. After this, substitute y for free occurrences of x. (So 
φ(y/x) in our example now works out as 

(∀z)(y = z) 

which says the right thing.) This more careful method of substitution is called 
substitution avoiding clash of variables. 
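
The two-step recipe (rename the bound occurrences of y to a fresh variable, then replace the free occurrences of x) can be written out as a short program. This is a sketch only, in Python, for formulas of the language with empty signature encoded as nested tuples; the encoding, the function names (variables, rename_bound, replace_free, substitute) and the choice of fresh variables z0, z1, ... are all assumptions made for the illustration.

```python
from itertools import count

QUANTIFIERS = ("all", "ex")   # ("all", "y", body) stands for (Ay)body

def variables(f):
    """All variables occurring anywhere in the formula."""
    if f[0] == "bot":
        return set()
    if f[0] == "eq":
        return {f[1], f[2]}
    if f[0] in QUANTIFIERS:
        return {f[1]} | variables(f[2])
    return set().union(*(variables(g) for g in f[1:]))

def rename_bound(f, old, new, bound=False):
    """Replace every bound occurrence of `old` by `new`."""
    if f[0] == "bot":
        return f
    if f[0] == "eq":
        if not bound:
            return f
        return ("eq",) + tuple(new if t == old else t for t in f[1:])
    if f[0] in QUANTIFIERS:
        if f[1] == old:
            return (f[0], new, rename_bound(f[2], old, new, True))
        return (f[0], f[1], rename_bound(f[2], old, new, bound))
    return (f[0],) + tuple(rename_bound(g, old, new, bound) for g in f[1:])

def replace_free(f, x, y):
    """Naive step: put y in place of each free occurrence of x."""
    if f[0] == "bot":
        return f
    if f[0] == "eq":
        return ("eq",) + tuple(y if t == x else t for t in f[1:])
    if f[0] in QUANTIFIERS:
        if f[1] == x:            # x is bound from here on; nothing to replace
            return f
        return (f[0], f[1], replace_free(f[2], x, y))
    return (f[0],) + tuple(replace_free(g, x, y) for g in f[1:])

def substitute(f, y, x):
    """phi(y/x), avoiding clash of variables."""
    used = variables(f) | {x, y}
    z = next(f"z{i}" for i in count() if f"z{i}" not in used)
    return replace_free(rename_bound(f, y, z), x, y)

if __name__ == "__main__":
    phi = ("all", "y", ("eq", "x", "y"))     # (Ay)(x = y)
    print(replace_free(phi, "x", "y"))       # the naive, wrong result
    print(substitute(phi, "y", "x"))         # the careful result
```

On the example from the text, the naive replacement produces the encoding of (∀y)(y = y), while substitute produces the encoding of (∀z0)(y = z0), matching the corrected formula up to the name of the fresh variable.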

The language L of the previous section is a very arid first-order language. The 
conditions that it can express on an interpretation I are very few. It can be used to 
say that I has at least one element, at least two elements, at least seven elements, 
either exactly a hundred or at least three billion elements, and similar things; but 
nothing else. (Is there a single sentence of L which expresses the condition that I has 
infinitely many elements? No. This is a consequence of the compactness theorem in 
section 1.10.) 
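
As a worked illustration of such counting conditions, here is how ‘at least three elements’ and ‘exactly two elements’ might be written out. The formulas are a sketch in LaTeX notation, with \sim for ‘~’ and \& for ‘&’, and with some brackets dropped under the usual conventions; they are offered as examples and are not taken from the chapter.

```latex
% "I has at least three elements"
(\exists x)(\exists y)(\exists z)\,
  \bigl(\sim(x = y) \;\&\; \sim(x = z) \;\&\; \sim(y = z)\bigr)

% "I has exactly two elements": at least two, and not at least three
(\exists x)(\exists y)\,\sim(x = y) \;\;\&\;\;
  \sim(\exists x)(\exists y)(\exists z)\,
  \bigl(\sim(x = y) \;\&\; \sim(x = z) \;\&\; \sim(y = z)\bigr)
```

The same pattern, with n variables and a conjunct for each pair, gives ‘at least n elements’ for any fixed n; no single sentence of this kind covers every n at once.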

Nevertheless L already shows some very characteristic features of first-order 
languages. For example, to work out the truth-value of a sentence φ under an 
interpretation I, one must generally consider the truth-values of subformulas of φ 
under various valuations. As explained in section 1.3, the notion of a valid sequent 
applies to formulas as well as sentences; but for formulas it means that every 
interpretation-plus-valuation making the formulas on the left true makes the formula 
on the right true too. 

Here are two important examples of valid sequents of the language L. The sequent 

⊢ (x = x) 

is valid because v(x) is always the same element as v(x). The sequent 

(x = y) ⊢ (φ ⊃ φ(y/x)) 

is valid because if two given elements are the same element, then they satisfy all the 
same conditions. 


1.6. Nonlogical Constants: Monadic First-Order Logic 


Section 1.5 ignored the main organ by which first-order formulas reach out to the 
world: the signature, the family of nonlogical constants. 

The various constants can be classified by the kinds of feature to which they have 
to be attached in the world. For example, some constants are called class symbols 
because their job is to stand for classes. (Their more technical name is 1-ary relation 
symbols.) Some other constants are called individual constants because their job is to 
stand for individuals, i.e. elements. This section concentrates on languages whose 
signature contains only constants of these two kinds. Languages of this type are said 
to be monadic. Let L be a monadic language. 

Usually, individual constants are lower-case letters ‘a’, ‘b’, ‘c’ etc. from the first 
half of the alphabet, with or without subscripts. Usually class symbols are capital 
letters ‘P’, ‘Q’, ‘R’ etc. from the second half of the alphabet, with or without 
number subscripts. 

Grammatically these constants provide some new kinds of atomic formula. It is 
helpful first to define the terms of L. There are two kinds: 

• Every variable is a term. 

• Every individual constant is a term. 

The definition of atomic formula needs revising: 

• Every expression of the form ‘(α = β)’, where α and β are terms, is an atomic 
formula of L. 

• If P is any class symbol and α any term, then ‘P(α)’ is an atomic formula. 

• ‘⊥’ is an atomic formula of L. 


Apart from these new clauses, the grammatical rules remain the same as in section 
1.4. 

What should count as an interpretation for a monadic language? Every interpretation 
I needs a universe, just as before. But now it also needs to give the truth-value 
of P(x) under a valuation that ties x to an element v(x), which might be any element 
of the universe. In other words, the interpretation needs to give the class, written Pᴵ, 
of all those elements a such that P(x) is true under any valuation v with v(x) equal 
to a. (Intuitively Pᴵ is the class of all elements that satisfy P(x) in I.) Here Pᴵ might 
be any subclass of the universe. 

In the branch of logic called model theory, the previous paragraph turns into a 
definition. A structure is an interpretation I of the following form: 

• I has a universe, which is a set (generally taken to be non-empty). 

• For each class symbol P in the signature, I picks out a corresponding class Pᴵ, 
called the interpretation of P under I, all of whose members are in the universe. 

• For each individual constant a in the signature, I picks out a corresponding 
element aᴵ in the universe, and this element is called the interpretation of a under I. 

Writing σ for the signature in question, this interpretation is called a σ-structure. 
So a σ-structure contains exactly the information needed to give a truth-value to a 
sentence of signature σ. Note that the interpretations aᴵ are needed to deal with 
sentences such as P(a). (The requirement that the universe should be a set rather 
than a class is no accident: a set is a mathematically well-behaved class. The precise 
difference is studied in texts of set theory. [See chapter 3.]) 
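
The three clauses of this definition can be mirrored by a small data structure. The following Python sketch is one possible rendering and nothing more: the class name Structure, its fields, and the methods denote and atomic are invented here, the universe is assumed finite, and the sample universe of two people is hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Structure:
    """A structure for a monadic signature (finite universes only)."""
    universe: frozenset
    classes: dict = field(default_factory=dict)     # class symbol -> subset of universe
    constants: dict = field(default_factory=dict)   # individual constant -> element

    def __post_init__(self):
        if not self.universe:
            raise ValueError("the universe is generally taken to be non-empty")
        for symbol, members in self.classes.items():
            if not set(members) <= set(self.universe):
                raise ValueError(f"class for {symbol!r} is not inside the universe")
        for name, element in self.constants.items():
            if element not in self.universe:
                raise ValueError(f"constant {name!r} names no element of the universe")

    def denote(self, term, v):
        """Element attached to a term: an individual constant or a variable."""
        return self.constants[term] if term in self.constants else v[term]

    def atomic(self, symbol, term, v=None):
        """Truth-value of the atomic formula symbol(term) under valuation v."""
        return self.denote(term, v or {}) in self.classes[symbol]

if __name__ == "__main__":
    I = Structure(
        universe=frozenset({"alice", "bob"}),
        classes={"P": {"alice"}},
        constants={"a": "alice", "b": "bob"},
    )
    print(I.atomic("P", "a"))                  # P(a)
    print(I.atomic("P", "b"))                  # P(b)
    print(I.atomic("P", "x", {"x": "bob"}))    # P(x) under v(x) = bob
```

The checks in __post_init__ correspond to the requirements that each class lies inside the universe and that each individual constant picks out an element of it; the valuation is needed only for atomic formulas that contain a variable.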

However, this model-theoretic definition is not as old as first-order logic. In the 
early days, logicians would give an interpretation for a by writing down a name or a 
description of a thing or person. They would give an interpretation for P by writing 
down a sentence of English or their own native language with x in place of one or 
more pronouns. Or sometimes they would write a sentence with ‘He’ or ‘It’ left off 
the front; or more drastically, a sentence with ‘He is a’ left off. For example an 
interpretation might contain the items: 

P : x is kind to people who are kind to x 
Q : is mortal 
R : taxpayer 

The third style here is the least flexible, but anyway it is not needed; it can easily be 
converted to the second style by writing ‘is a taxpayer.’ The second style in turn is 
less flexible than the first, and again is not needed. Q and R could be written ‘x is 
mortal’, ‘x is a taxpayer’. A sentence with variables in place of some pronouns is 
sometimes called a predicate. 

Can every interpretation by predicates be converted into a structure? Yes, provided 
that each of the predicates has a certain property: the question whether an 
element of the universe satisfies the predicate always has a definite answer (Yes or No) 
which depends only on the element and not on how it is described. Predicates with this 
property are said to be extensional. The following predicates seem not to be extensional 
(though this is an area where people have presented strong arguments for some 
quite surprising conclusions): 

x is necessarily equal to 7. 
I recognized x. 


The prevailing view is that to handle predicates like these, a logic with a subtler 
semantics than first-order logic is needed. Modal logic takes on board the first 
example, epistemic logic the second. [See chapters 7 and 9.] 

The predicate 

x is bald 

also fails the test, not because it is possible to be bald under one name and 
bushy-haired under another, but because there are borderline cases: people who aren't 
definitely bald or definitely not bald. So this predicate does not succeed in defining 
a class of people. Truth to tell, most natural language predicates are at least slightly 
vague; even logicians have to live with the roughnesses of the real world. 

Given an interpretation I that uses predicates, a first-order sentence φ can often be 
translated into an English sentence which is guaranteed to be true if and only if φ 
is true in I. The translation will generally need to mention the universe of the 
interpretation, unless a predicate is used to describe that too. Here are some examples, 
using the interpretation a couple of paragraphs above, together with the universe 
described by ‘x is a person’: 

(∀x)(R(x) ⊃ Q(x)) 
Every person who is a taxpayer is mortal. 

(∃x)P(x) 
At least one person is kind to people who are kind to him or her. 

(∃x)(R(x) & (∀y)(R(y) ⊃ (y = x))) 
Exactly one person is a taxpayer. 
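
Once the predicates have been traded for classes, the three sentences can be evaluated directly, with Python's all and any playing the parts of the two quantifiers. The sketch below makes several assumptions that are not in the text: the three-person universe and the memberships of P, Q and R are invented, and the function names are chosen only for readability.

```python
def implies(p, q):
    """The truth-function for the 'if ... then' symbol."""
    return (not p) or q

def every_taxpayer_is_mortal(universe, Q, R):
    # (Ax)(R(x) > Q(x))
    return all(implies(x in R, x in Q) for x in universe)

def someone_is_kind(universe, P):
    # (Ex)P(x)
    return any(x in P for x in universe)

def exactly_one_taxpayer(universe, R):
    # (Ex)(R(x) & (Ay)(R(y) > (y = x)))
    return any(
        x in R and all(implies(y in R, y == x) for y in universe)
        for x in universe
    )

if __name__ == "__main__":
    # A hypothetical universe of persons with hypothetical classes.
    universe = {"ann", "ben", "cat"}
    P = {"ann"}                      # kind to people who are kind to them
    Q = {"ann", "ben", "cat"}        # mortal
    R = {"ben"}                      # taxpayers
    print(every_taxpayer_is_mortal(universe, Q, R))
    print(someone_is_kind(universe, P))
    print(exactly_one_taxpayer(universe, R))
```

The third function shows why the sentence says ‘exactly one’: it fails both when R is empty (no witness x) and when R has two members (the inner universal claim fails for either choice of x).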


The reader may well agree with the following comment: If these first-order sentences 
are being used to express the English sentences in question, then it is artificial 
to ask for a universe at all. In ordinary speech, no one asks people to state their 
universes. 

This comment needs answers on several levels. First, mathematical objects (such 
as groups, rings, boolean algebras and the like) consist of a set of objects with 
certain features picked out by nonlogical constants. So it was natural for the math- 
ematical creators of first-order logic to think of this set of objects as a universe. 

Second, there is a mathematical result that takes some of the sting out of the 
requirement that a universe has to be chosen. An occurrence of a universal quantifier 
is restricted if it occurs as follows: 

(∀x)(P(x) ⊃ ···) 

i.e., followed immediately by a left bracket, a class symbol with the same variable, 
and then ‘⊃’. Likewise an occurrence of an existential quantifier is restricted if it 
looks like this: 

(∃x)(P(x) & ···) 

The mathematical result states: 


Theorem 1.1  Let φ be a sentence of the first-order language L of signature 
σ, and suppose that all occurrences of quantifiers in φ are restricted. Let I be a 
σ-structure and J be another σ-structure which comes from I by removing some 
elements which are not inside Pᴵ for any class symbol P. Then φ has the same 
truth-value in I and in J. 


First-order sentences that serve as straightforward translations of English sentences 
usually have all their quantifiers restricted, as in the first and third examples above. 
(The second example can be rewritten harmlessly as 

(∃x)(P(x) & P(x)) 

and then its quantifier is restricted too.) So the choice of universe may be largely 
ad hoc, but it is also largely irrelevant. (This theorem remains true for first-order 
languages that are not monadic.) 
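
Theorem 1.1 can be seen at work on a small case. The sketch below, in Python, compares two structures that differ only by elements lying in no class: the universes, the classes Q and R, and the function names are all hypothetical choices for the illustration, and an unrestricted sentence is included to show what the theorem does not promise.

```python
def implies(p, q):
    return (not p) or q

def every_R_is_Q(universe, Q, R):
    # (Ax)(R(x) > Q(x)): the quantifier is restricted by R
    return all(implies(x in R, x in Q) for x in universe)

def exactly_one_R(universe, R):
    # (Ex)(R(x) & (Ay)(R(y) > (y = x))): both quantifiers restricted by R
    return any(
        x in R and all(implies(y in R, y == x) for y in universe)
        for x in universe
    )

def everything_is_Q(universe, Q):
    # (Ax)Q(x): the quantifier is NOT restricted
    return all(x in Q for x in universe)

# Hypothetical classes, and two universes that differ only by elements
# lying outside every class.
Q = {"ann", "ben"}
R = {"ben"}
I_UNIVERSE = {"ann", "ben", "rock", "tree"}
J_UNIVERSE = I_UNIVERSE - {"rock", "tree"}

if __name__ == "__main__":
    for name, universe in (("I", I_UNIVERSE), ("J", J_UNIVERSE)):
        print(name,
              every_R_is_Q(universe, Q, R),
              exactly_one_R(universe, R),
              everything_is_Q(universe, Q))
```

The two restricted sentences receive the same truth-value in I and in J, as the theorem requires, while the unrestricted sentence (∀x)Q(x) changes its truth-value when the extra elements are removed.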

Third, if the class symbols are interpreted by predicates rather than by classes, the 
choice of universe certainly can make a difference to truth-values, even for sentences 
covered by the theorem just stated. Suppose, for example, that an interpretation is 
being used, with 

P : x is a person. 
Q : x will be dead before the year 2200. 

With such an interpretation, the sentence 

(∀x)(P(x) ⊃ Q(x)) 

expresses that every person will be dead before the year 2200. This is probably true 
of people alive now, but probably false if ‘person’ includes people yet to be born. So 
different universes give different truth-values. Why does this not contradict Theorem 
1.1? Because the predicate ‘x is a person’ picks out different classes according as 
future people are excluded or included, so that the corresponding σ-structures differ 
in their assignments to P and Q, not just in their universes. 

If a universe can contain future people, can it contain possible people, or fictional 
people, or even impossible people (like the man I met who wasn’t there, in the 
children’s jingle)? Or to be more metaphysical, can a universe contain as separate 
elements myself-now and myself-ten-years-ago? First-order logic is very robust about 
questions like these: it doesn’t give a damn. If you think that there are fictional 
people and that they have or fail to have this property or that, and can meaningfully 
be said to be the same individuals or not the same individuals as one another, then 
fine, put them in your universes. Likewise, if you think there are time-slices of 
people. If you don’t, then leave them out. 

All these remarks about universes apply equally well to the more general first-order 
languages of section 1.7. Here is a theorem that does not. 


Theorem 1.2 If L is a monadic first-order language with a finite signature, then 
L is decidable. 


See, for example, Boolos and Jeffrey (1974, ch. 25, ‘Monadic versus dyadic logic’) 
for a proof of this theorem. 


1.7. Some More Nonlogical Constants 


Most logicians before about 1850, if they had been set to work designing a first-order 
language, would probably have been happy to stick with the kinds of constant 
already introduced here. Apart from some subtleties and confusions about empty 
classes, the traditional syllogistic forms correspond to the four sentence-types 


(∀x)(P(x) → Q(x))        (∀x)(P(x) → ~Q(x)) 

(∃x)(P(x) & Q(x))        (∃x)(P(x) & ~Q(x)) 

The main pressure for more elaborate forms came from mathematics, where 
geometers wanted symbols to represent predicates such as: 

y is a point lying on the line x 

x is between y and z 

Even these two examples show that there is no point in restricting ourselves in 
advance to some fixed number of variables. So, class symbols are generalized to 
n-ary relation symbols, where the arity, n, is the number of distinct variables needed 
in a predicate that interprets the relation symbol. 


Like class symbols, relation symbols are usually ‘P’, ‘Q’, etc., i.e. capital letters 
from near the end of the alphabet. An ordered n-tuple is a list 


(a_1, ..., a_n) 


where a_i is the ith item in the list; the same object may appear as more than one 
item in the list. The interpretation R^I of a relation symbol R of arity n in an 
interpretation I is a set of ordered n-tuples of elements in the universe of I. If R^I is 
specified by giving a particular predicate for R, then which variables of the predicate 
belong with which places in the lists must also be specified. An example shows how: 


R(x, y, z) : z is between x and y 


Class symbols are included as the relation symbols of arity 1, by taking a list 

(a) 

of length 1 to be the same thing as its unique item. 

There can also be relation symbols of arity 0 if it is decided that there is exactly 
one list () of length 0. So the interpretation p^I of a 0-ary relation symbol p is either 
the empty set (call it Falsehood) or else the set whose one element is () (call this set 
Truth). All this makes good sense set-theoretically. What matters here, however, is 
the outcome: relation symbols of arity 0 are called propositional symbols, and they are 
always interpreted as Truth or as Falsehood. A sentence which contains neither ‘=’, 
quantifiers nor any nonlogical constants except propositional symbols is called a 
propositional sentence. Propositional logic is about propositional sentences. 

The language can be extended in another way by introducing nonlogical symbols 
called n-ary function symbols, where n is a positive integer. The interpretation F^I of 
such a symbol F is a function which assigns an element of I to each ordered n-tuple 
of elements of I. (Again, there is a way of regarding individual constants as 0-ary 
function symbols, but the details can be skipped here.) 
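
The set-theoretic reading of relation and function symbols can be mirrored in a few lines of Python. This is only a sketch under stated assumptions: the four-element universe, the reading of ‘between’ as numerical betweenness, and the names `between`, `holds` and `successor_mod4` are all invented for the example. One difference from the text: Python distinguishes the 1-tuple `(a,)` from `a`, whereas the text identifies a list of length 1 with its unique item.

```python
# A hypothetical finite universe.
universe = {0, 1, 2, 3}

# Interpretation of a 3-ary relation symbol R, following the convention
# R(x, y, z) : z is between x and y.  It is a set of ordered 3-tuples.
between = {
    (x, y, z)
    for x in universe for y in universe for z in universe
    if min(x, y) < z < max(x, y)
}

# There is exactly one list of length 0: the empty tuple ().
TRUTH = {()}        # the set whose one element is ()
FALSEHOOD = set()   # the empty set


def holds(relation, *args):
    """R(a_1, ..., a_n) is true when the n-tuple lies in the interpretation of R."""
    return tuple(args) in relation


def successor_mod4(x):
    """Interpretation of a 1-ary function symbol: assigns an element to each element."""
    return (x + 1) % 4
```

With these definitions a propositional symbol interpreted as `TRUTH` holds of the empty argument list, and one interpreted as `FALSEHOOD` does not.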

The new symbols require some more adjustments to the grammar. The clause for 
terms becomes: 


* Every variable is a term. 

* Every individual constant is a term. 

* If F is a function symbol of arity n, and α_1, ..., α_n are terms, then ‘F(α_1, ..., α_n)’ 
is a term. 


The definition of atomic formula becomes: 


* Every expression of the form ‘(α = β)’, where α and β are terms, is an atomic 
formula of L. 

* Every propositional symbol is an atomic formula. 

* If R is any relation symbol of positive arity n and α_1, ..., α_n are terms, then 
‘R(α_1, ..., α_n)’ is an atomic formula. 

* ‘⊥’ is an atomic formula. 


The semantic rules are the obvious adjustments of those in the previous section. 
Some notation from section 1.5 can be usefully extended. If φ is a formula and α 
is a term, 


φ(α/x) 

represents the formula obtained from φ by replacing all free occurrences of x by α. 
As in section 1.5, to avoid clash of variables, the bound variables in φ may need to 
be changed first, so that they do not bind any variables in α. 
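
The substitution φ(α/x), including the renaming of bound variables, can be sketched as follows. The tuple encoding of terms and formulas, the function names, and the scheme of fresh names `v0`, `v1`, ... are assumptions made for this illustration; the text does not prescribe any of them.

```python
import itertools

# Terms:    ('var', 'x') | ('const', 'c') | ('fn', 'F', (term, ...))
# Formulas: ('rel', 'R', (term, ...)) | ('eq', term, term) | ('bot',)
#           | ('not', f) | ('and'|'or'|'imp', f, g) | ('forall'|'exists', 'x', f)


def term_vars(t):
    """Variables occurring in a term."""
    if t[0] == 'var':
        return {t[1]}
    if t[0] == 'const':
        return set()
    return set().union(*[term_vars(s) for s in t[2]])


def all_vars(f):
    """Every variable occurring in a formula, free or bound."""
    kind = f[0]
    if kind == 'rel':
        return set().union(*[term_vars(t) for t in f[2]])
    if kind == 'eq':
        return term_vars(f[1]) | term_vars(f[2])
    if kind == 'bot':
        return set()
    if kind == 'not':
        return all_vars(f[1])
    if kind in ('and', 'or', 'imp'):
        return all_vars(f[1]) | all_vars(f[2])
    return {f[1]} | all_vars(f[2])            # 'forall' or 'exists'


def subst_term(t, x, a):
    """Replace variable x by term a inside term t."""
    if t[0] == 'var':
        return a if t[1] == x else t
    if t[0] == 'const':
        return t
    return ('fn', t[1], tuple(subst_term(s, x, a) for s in t[2]))


def fresh(avoid):
    """A variable name v0, v1, ... that is not in the set `avoid`."""
    return next(f"v{i}" for i in itertools.count() if f"v{i}" not in avoid)


def subst(f, x, a):
    """phi(a/x): replace the free occurrences of variable x in f by term a."""
    kind = f[0]
    if kind == 'rel':
        return ('rel', f[1], tuple(subst_term(t, x, a) for t in f[2]))
    if kind == 'eq':
        return ('eq', subst_term(f[1], x, a), subst_term(f[2], x, a))
    if kind == 'bot':
        return f
    if kind == 'not':
        return ('not', subst(f[1], x, a))
    if kind in ('and', 'or', 'imp'):
        return (kind, subst(f[1], x, a), subst(f[2], x, a))
    quantifier, y, body = f
    if y == x:                      # x is bound here, so nothing below is free
        return f
    if y in term_vars(a):           # rename y first so it cannot capture a
        z = fresh(all_vars(body) | term_vars(a) | {x})
        body = subst(body, y, ('var', z))
        y = z
    return (quantifier, y, subst(body, x, a))


# Substituting y for x in (∃y)R(x, y) must not produce (∃y)R(y, y).
phi = ('exists', 'y', ('rel', 'R', (('var', 'x'), ('var', 'y'))))
result = subst(phi, 'x', ('var', 'y'))
```

Here `result` encodes (∃v0)R(y, v0): the bound variable was changed before the replacement, as the text requires.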


1.8. Proof Calculi 


First-order logic has a range of proof calculi. With a very few exceptions all these 
proof calculi apply to all first-order languages. So, for the rest of this section assume 
that L is a particular first-order language of signature σ. 

The first proof calculi to be discovered were the Hilbert-style calculi, where one 
reaches a conclusion by applying deduction rules to axioms. An example is described 
later in this section. These calculi tend to be very ad hoc in their axioms, and 
maddeningly wayward if one is looking for proofs in them. However, they have their 
supporters, e.g., modal logicians who need a first-order base to which further axioms 
can be added. 

In 1934, Gentzen (1969) invented two other styles of calculus. One was the 
natural deduction calculus (independently proposed by Jaskowski slightly earlier). 
An intuitionistic natural deduction calculus is given in chapter 11, which, as noted 
there, can be extended to make a calculus for classical first-order logic by the 
addition of a rule for double-negation elimination. Gentzen’s second invention was 
the sequent calculus, which could be regarded as a Hilbert-style calculus for deriving 
finite sequents instead of formulas. With this subtle adjustment, nearly all of the 
arbitrariness of Hilbert-style systems falls away, and it is even possible to convert 
each sequent calculus proof into a sequent calculus proof in a very simple form 
called a cut-free proof. The popular tableau or truth-tree proofs are really cut-free 
sequent proofs turned upside down. A proof of a sequent in any of the four kinds 
of calculi (Hilbert-style, natural deduction, sequent calculus, tableaux) can be 
mechanically converted to a proof of the same sequent in any of the other calculi; 
see Sundholm (1983) for a survey. 

The resolution calculus also deserves a mention. This calculus works very fast on 
computers, but its proofs are almost impossible for a normal human being to make 
any sense of, and it requires the sentences to be converted to a normal form (not 
quite the one in section 1.10 below) before the calculation starts; see, for example, 
Gallier (1986). 

To sketch a Hilbert-style calculus, called H, first define the class of axioms of H. 
This is the set of all formulas of the language L which have any of the following 
forms: 


H1   φ → (ψ → φ) 
H2   (φ → ψ) → ((φ → (ψ → χ)) → (φ → χ)) 
H3   (~φ → ψ) → ((~φ → ~ψ) → φ) 
H4   ((φ → ⊥) → ⊥) → φ 
H5   φ → (ψ → (φ & ψ)) 
H6   (φ & ψ) → φ,  (φ & ψ) → ψ 
H7   φ → (φ ∨ ψ),  ψ → (φ ∨ ψ) 
H8   (φ → χ) → ((ψ → χ) → ((φ ∨ ψ) → χ)) 
H9   (φ → ψ) → ((ψ → φ) → (φ ↔ ψ)) 
H10  (φ ↔ ψ) → (φ → ψ),  (φ ↔ ψ) → (ψ → φ) 
H11  φ(α/x) → ∃xφ   (α any term) 
H12  ∀xφ → φ(α/x)   (α any term) 
H13  x = x 
H14  x = y → (φ → φ(y/x)) 


A derivation (or formal proof) in H is defined to be a finite sequence 

((φ_1, m_1), ..., (φ_n, m_n)) 


such that n ≥ 1, and for each i (1 ≤ i ≤ n) one of the five following conditions 
holds. (Clauses (c)-(e) are known as the derivation rules of H.) 


(a) m_i = 1 and φ_i is an axiom. 

(b) m_i = 2 and φ_i is any formula of L. 

(c) m_i = 3 and there are j and k in {1, ..., i−1} such that φ_k is φ_j → φ_i. 

(d) m_i = 4 and there is j (1 ≤ j < i) such that φ_j has the form ψ → χ, x is a variable 
not occurring free in ψ, and φ_i is ψ → ∀xχ. 

(e) m_i = 5 and there is j (1 ≤ j < i) such that φ_j has the form ψ → χ, x is a variable 
not occurring free in χ, and φ_i is ∃xψ → χ. 





The premises of this derivation are the formulas φ_i for which m_i = 2. Its conclusion is 
φ_n. We say that ψ is derivable in H from a set T of formulas, in symbols 


T ⊢_H ψ 

if there exists a derivation whose conclusion is ψ and all of whose premises are in T. 


Proofs are usually written vertically rather than horizontally. For example here is 
a proof of (φ → φ), where φ is any formula: 


(1) (φ → (φ → φ)) → ((φ → ((φ → φ) → φ)) → (φ → φ))   [Axiom H2] 
(2) φ → (φ → φ)   [Axiom H1] 
(3) (φ → ((φ → φ) → φ)) → (φ → φ)   [Rule (c) from (1), (2)] 
(4) φ → ((φ → φ) → φ)   [Axiom H1] 
(5) φ → φ   [Rule (c) from (3), (4)] 

To save the labor of writing this argument every time a result of the form φ → φ is 
needed, this result can be quoted as a lemma in further proofs. Thus ~⊥ can be 
proved as follows: 


(1) ~⊥ → ~⊥   [Lemma] 
(2) (~⊥ → ~⊥) → ((~⊥ → ⊥) → (~⊥ → ~⊥))   [Axiom H1] 
(3) (~⊥ → ⊥) → (~⊥ → ~⊥)   [Rule (c) from (1), (2)] 
(4) ((~⊥ → ⊥) → (~⊥ → ~⊥)) → (((~⊥ → ⊥) → ((~⊥ → ~⊥) → ⊥)) → ((~⊥ → ⊥) → ⊥)) 
    [Axiom H2] 
(5) ((~⊥ → ⊥) → ((~⊥ → ~⊥) → ⊥)) → ((~⊥ → ⊥) → ⊥) 
    [Rule (c) from (3), (4)] 
(6) (~⊥ → ⊥) → ((~⊥ → ~⊥) → ⊥)   [Axiom H3] 
(7) (~⊥ → ⊥) → ⊥   [Rule (c) from (5), (6)] 
(8) ((~⊥ → ⊥) → ⊥) → ~⊥   [Axiom H4] 
(9) ~⊥   [Rule (c) from (7), (8)] 


Then this result can be quoted in turn as a lemma in a proof of (φ → ⊥) → ~φ, and 
so on. 

A theory T is H-inconsistent if there is some formula φ such that (φ & ~φ) is 
derivable from T in H. If the language L contains ⊥, then it can be shown that 
this is equivalent to saying that ⊥ is derivable from T in H. T is H-consistent if it is 
not H-inconsistent. H-inconsistency is one example of syntactic inconsistency; other 
proof calculi give other examples. 
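
The definition of a derivation in this section is mechanical enough to be checked by a short program. The sketch below is a deliberately small fragment, not the full calculus: it knows only the axiom schemas H1 and H2 and clauses (a)-(c), and it represents formulas as nested tuples with string atoms. The encoding and the names `imp`, `match`, `is_axiom` and `check` are assumptions made for the illustration.

```python
def imp(a, b):
    """The formula (a → b)."""
    return ('imp', a, b)


# Axiom schemas H1 and H2; the strings 'A', 'B', 'C' are metavariables.
H1 = imp('A', imp('B', 'A'))
H2 = imp(imp('A', 'B'), imp(imp('A', imp('B', 'C')), imp('A', 'C')))


def match(schema, formula, env):
    """Try to extend env so that the schema, with metavariables replaced, is the formula."""
    if isinstance(schema, str):                 # a metavariable
        if schema in env:
            return env[schema] == formula
        env[schema] = formula
        return True
    return (
        isinstance(formula, tuple)
        and len(formula) == len(schema)
        and formula[0] == schema[0]
        and all(match(s, f, env) for s, f in zip(schema[1:], formula[1:]))
    )


def is_axiom(formula):
    """True if the formula is an instance of H1 or H2 (the only schemas covered here)."""
    return any(match(schema, formula, {}) for schema in (H1, H2))


def check(derivation, premises=()):
    """Check a list of (formula, m) pairs against clauses (a)-(c).

    Clause (b) is narrowed to 'the formula belongs to `premises`', so that
    a True result means: derivable from `premises` in this fragment.
    """
    for i, (phi, m) in enumerate(derivation):
        earlier = [f for f, _ in derivation[:i]]
        if m == 1:
            ok = is_axiom(phi)
        elif m == 2:
            ok = phi in premises
        elif m == 3:    # some earlier formula is (phi_j → phi) with phi_j earlier too
            ok = any(imp(fj, phi) in earlier for fj in earlier)
        else:
            ok = False  # rules (d) and (e) are outside this sketch
        if not ok:
            return False
    return len(derivation) >= 1


# The five-line proof of (p → p), with the atom 'p' standing in for φ.
p = 'p'
pp = imp(p, p)
proof_of_pp = [
    (imp(imp(p, pp), imp(imp(p, imp(pp, p)), pp)), 1),   # Axiom H2
    (imp(p, pp), 1),                                     # Axiom H1
    (imp(imp(p, imp(pp, p)), pp), 3),                    # Rule (c)
    (imp(p, imp(pp, p)), 1),                             # Axiom H1
    (pp, 3),                                             # Rule (c)
]
accepted = check(proof_of_pp)
```

Passing a non-empty `premises` argument lets the same function check derivations from a theory T, again only within the H1/H2 fragment.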





1.9. Correctness and Completeness 


Theorem 1.3 (Correctness Theorem for H) Suppose φ_1, ..., φ_n and ψ are 
sentences. If ψ is derivable in H from φ_1, ..., φ_n then the sequent 


φ_1, ..., φ_n ⊢ ψ 

is valid. 

Proof sketch This is proved by induction on the length of the shortest derivation of 
ψ from φ_1, ..., φ_n. Unfortunately, the formulas in the derivation need not be 
sentences. So for the induction hypothesis something a little more general needs to 
be proved: 


Suppose φ_1, ..., φ_n are sentences and ψ is a formula whose free variables are all 
among x_1, ..., x_m. If ψ is derivable in H from φ_1, ..., φ_n then the sequent 

φ_1, ..., φ_n ⊢ ∀x_1 ··· ∀x_m ψ 

is valid. 

The argument splits into cases according to the last derivation rule used in the 
proof. Suppose, for example, that this was the rule numbered (5) above (clause (e)), 
and ψ is the formula ∃yθ → χ where y is not free in χ. Then, from the induction 
hypothesis, the sequent 

φ_1, ..., φ_n ⊢ ∀x_1 ··· ∀x_m ∀y(θ → χ) 

is valid. Using the fact that y is not free in χ, it can be checked that the sequent 

∀x_1 ··· ∀x_m ∀y(θ → χ) ⊢ ∀x_1 ··· ∀x_m(∃yθ → χ) 

is valid. By this and the induction hypothesis, the sequent 

φ_1, ..., φ_n ⊢ ∀x_1 ··· ∀x_m(∃yθ → χ) 

is valid as required. QED 
Now, the completeness question: 


Theorem 1.4 (Completeness Theorem for H) Suppose that T is a theory and ψ is 
a sentence such that the sequent 


T ⊢ ψ 

is valid. Then ψ is derivable from T in H. 


In fact one proves the special case of the Completeness Theorem where ψ is ⊥; in 
other words 


If T is a theory with no models, then T ⊢_H ⊥. 

This is as good as proving the whole theorem, since the sequent 

T ∪ {~ψ} ⊢ ⊥ 

is equivalent to ‘T ⊢ ψ’ both semantically and in terms of derivability in H. 

Here, the Completeness Theorem is proved by showing that if T is any H-consistent 
theory then T has a model. A technical lemma about H is needed along 
the way: 


Lemma 1.5 Suppose c is a constant which occurs nowhere in the formula φ, the 
theory T or the sentence ψ. If 

T ⊢_H φ(c/x) → ψ 

then 

T ⊢_H ∃xφ → ψ 


Proof sketch of the Completeness Theorem This is known as a Henkin-style proof 
because of three features: the constants added as witnesses, the construction of a 
maximal consistent theory, and the way that a model is built using sets of terms as 
elements. The proof uses a small amount of set theory, chiefly infinite cardinals and 
ordinals. [See chapter 3.] 

Assume an H-consistent theory T in the language L. Let κ be the number of 
formulas of L; κ is always infinite. Expand the language L to a first-order language 
L' by adding to the signature a set of κ new individual constants; these new constants 
are called witnesses. List the sentences of L' as (φ_i : i < κ). Now define for each 
i < κ a theory T_i, so that 


T_0 ⊆ T_1 ⊆ ··· 


and each T_i is H-consistent. To start the process, put T_0 = T. When i is a limit 
ordinal, take T_i to be the union of the T_j with j < i; this theory is H-consistent since 
any inconsistency would have a proof using finitely many sentences, all of which 
would lie in some T_j with j < i. 

The important choice is where i is a successor ordinal, say i = j + 1. If T_j ∪ {φ_j} is 
not H-consistent, take T_{j+1} to be T_j. Otherwise, put T'_j = T_j ∪ {φ_j}. Then if φ_j is of the 
form ∃xψ, choose a witness c that appears nowhere in any sentence of T'_j, and put 
T_{j+1} = T'_j ∪ {ψ(c/x)}; otherwise put T_{j+1} = T'_j. By Lemma 1.5, T_{j+1} is H-consistent in 
all these cases. 

Write T* for the union of all the theories T_i. It has the property that if φ_j is any 
sentence of L' for which T* ∪ {φ_j} is H-consistent, then T_j ∪ {φ_j} was already 
H-consistent and so φ_j is in T* by construction. (As noted, T* is maximal consistent.) 
Moreover if T* contains a sentence φ_j of the form ∃xψ, then by construction it also 
contains ψ(c/x) for some witness c. 

Two witnesses c and d are equivalent if the sentence ‘c = d’ is in T*. Now if ‘c = d’ 
and ‘d = e’ are both in T*, then (appealing to the axioms and rules of H) the theory 
T* ∪ {‘c = e’} is H-consistent, and so ‘c = e’ is also in T*. This and similar arguments 
show that ‘equivalence’ is an equivalence relation on the set of witnesses. Now build 
a structure A* whose universe is the set of equivalence classes c~ of witnesses c. For 
example, if P is a 2-ary relation symbol in the signature, then take P^{A*} to be the set 
of all ordered pairs (c~, d~) such that the sentence ‘P(c, d)’ is in T*. There are a 
number of details to be checked, but the outcome is that A* is a model of T*. Now, 
stripping the witnesses out of the signature gives a structure A whose signature is 
that of L, and A is a model of all the sentences of L that are in T*. In particular, A 
is a model of T, as required. (Note that A has at most κ elements, since there were 
only κ witnesses.) QED 


1.10. Metatheory of First-Order Logic 


The metatheory of a logic consists of those things that one can say about the logic, 
rather than in it. All the numbered theorems of this chapter are examples. The 
metatheory of first-order logic is vast. Here are a few high points, beginning with 
some consequences of the Completeness Theorem for H. 


Theorem 1.6 (Compactness Theorem) Suppose T is a first-order theory, ψ is a 
first-order sentence and T entails ψ. Then there is a finite subset U of T such that 
U entails ψ. 


Proof If T entails ψ then the sequent 

T ⊢ ψ 

is valid, and so by the completeness of the proof calculus H, the sequent has a formal 
proof. Let U be the set of sentences in T which are used in this proof. Since the 
proof is a finite object, U is a finite set. But the proof is also a proof of the sequent 

U ⊢ ψ 

So by the correctness of H, U entails ψ. QED 


Corollary 1.7 Suppose T is a first-order theory and every finite subset of T has 
a model. Then T has a model. 


Proof Working backwards, it is enough to prove that if T has no model then some 
finite subset of T has no model. If T has no model then T entails ⊥, since ⊥ has no 
models. So by the Compactness Theorem, some finite subset U of T entails ⊥. But 
this implies that U has no model. QED 

The next result is the weakest of a family of theorems known as the Downward 
Löwenheim-Skolem Theorem. 


Theorem 1.8 Suppose L is a first-order language with at most countably many 
formulas, and let T be a consistent theory in L. Then T has a model with at most 
countably many elements. 


Proof Assuming T is semantically consistent, it is H-consistent by the correctness of 
H. So the sketch proof of the Completeness Theorem in section 1.9 constructs a 
model A of T. By the last sentence of section 1.9, A has at most countably many 
elements. QED 

There is also an Upward Löwenheim-Skolem Theorem, which says that every first-order 
theory with infinite models has arbitrarily large models. 
A basic conjunction is a formula of the form 

(φ_1 & ··· & φ_m) 

where each φ_i is either an atomic formula or an atomic formula preceded by ~. 
(Note that m = 1 is allowed, so that a single atomic formula, with or without ~, 
counts as a basic conjunction.) A formula is in disjunctive normal form if it has the 
form 


(ψ_1 ∨ ··· ∨ ψ_n) 


where each ψ_j is a basic conjunction. (Again, n = 1 is allowed, so that a basic 
conjunction counts as being in disjunctive normal form.) 

A first-order formula is said to be prenex if it consists of a string of quantifiers 
followed by a formula with no quantifiers in it. (The string of quantifiers may be 
empty, so that a formula with no quantifiers counts as being prenex.) 

A formula is in normal form if it is prenex and the part after the quantifiers is in 
disjunctive normal form. 


Theorem 1.9 (Normal Form Theorem) Every first-order formula φ is equivalent 
to a first-order formula ψ of the same signature as φ, which has the same free 
variables as φ and is in normal form. 
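
For propositional sentences, where there are no quantifiers to bring to the front, a disjunctive normal form can be read off a truth table. The sketch below covers only that special case and is not a proof of Theorem 1.9. It assumes formulas are encoded as nested tuples with string atoms (propositional symbols), and the function names are invented for the example.

```python
from itertools import product


def atoms(f):
    """The set of propositional symbols occurring in f."""
    if isinstance(f, str):
        return {f}
    return set().union(*[atoms(g) for g in f[1:]])


def value(f, v):
    """Truth-value of f under the assignment v (a dict from symbols to bools)."""
    if isinstance(f, str):
        return v[f]
    op = f[0]
    if op == 'not':
        return not value(f[1], v)
    if op == 'and':
        return value(f[1], v) and value(f[2], v)
    if op == 'or':
        return value(f[1], v) or value(f[2], v)
    if op == 'imp':
        return (not value(f[1], v)) or value(f[2], v)
    raise ValueError(op)


def dnf(f):
    """A list of basic conjunctions; each is a list of (symbol, sign) literals."""
    names = sorted(atoms(f))
    rows = []
    for bits in product([True, False], repeat=len(names)):
        v = dict(zip(names, bits))
        if value(f, v):                         # one basic conjunction per true row
            rows.append([(p, v[p]) for p in names])
    if not rows:                                # unsatisfiable: fall back to (p & ~p)
        p = names[0]
        rows.append([(p, True), (p, False)])
    return rows


def show(rows):
    """Print a DNF in the notation of the text, with ~, & and ∨."""
    def literal(p, sign):
        return p if sign else '~' + p
    return ' ∨ '.join(
        '(' + ' & '.join(literal(p, s) for p, s in row) + ')' for row in rows
    )


example = show(dnf(('imp', 'p', 'q')))
```

For the sentence p → q this yields `(p & q) ∨ (~p & q) ∨ (~p & ~q)`, one basic conjunction for each line of the truth table on which the sentence is true.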


The next theorem, Lyndon’s Interpolation Theorem, deserves to be better known. 
Among other things, it is the first-order form of some laws which were widely 
known to logicians of earlier centuries as the Laws of Distribution (Hodges, 1998). 
It is stated here for sentences in normal form; by Theorem 1.9, this implies a 
theorem about all first-order sentences. 

Suppose φ is a first-order sentence in normal form. An occurrence of a relation 
symbol in φ is called positive if it has no ‘~’ immediately in front of it, and negative 
if it has. 


Theorem 1.10 (Lyndon’s Interpolation Theorem) Suppose φ and ψ are first-order 
sentences in normal form, and φ entails ψ. Then there is a first-order 
sentence θ in normal form, such that 


* φ entails θ and θ entails ψ 

* every relation symbol which has a positive occurrence in θ has positive occurrences 
in both φ and ψ, and 

* every relation symbol which has a negative occurrence in θ has negative 
occurrences in both φ and ψ. 


Lyndon’s theorem can be proved either by analyzing proofs of the sequent ‘φ ⊢ ψ’, 
or by a set-theoretic argument using models of φ and ψ. Both arguments are too 
complicated to give here. 

An important corollary of Lyndon’s Interpolation Theorem is Craig’s Interpolation 
Theorem, which was proved a few years before Lyndon’s. 


Corollary 1.11 (Craig’s Interpolation Theorem) Suppose φ and ψ are first-order 
sentences, and φ entails ψ. Then there is a first-order sentence θ such that 


* φ entails θ and θ entails ψ 
* every relation symbol that occurs in θ occurs both in φ and in ψ. 


Craig’s Interpolation Theorem in turn implies Beth’s Definability Theorem, which 
was proved earlier still. But all these theorems are from the 1950s, perhaps the last 
great age of elementary metatheory. 


Corollary 1.12 (Beth’s Definability Theorem) Suppose φ is a first-order sentence 
in which a relation symbol R of arity n occurs, and suppose also that there are not 
two models I and J of φ which are identical except that R^I is different from R^J. 
Then φ entails some first-order sentence of the form 


(∀x_1) ··· (∀x_n)(ψ ↔ R(x_1, ..., x_n)) 

where ψ is a formula in which R never occurs. 


Finally, note a metatheorem of a different kind, to contrast with Theorem 1.2 
above: a form of Church’s Theorem on the Undecidability of First-Order Logic. 


Theorem 1.13 Suppose L is a first-order language whose signature contains at 
least one n-ary relation symbol with n > 1. Then L is not decidable. 


A reference for all the metatheorems in this section except Theorems 1.9 and 1.13 
is Hodges (1997). Theorem 1.9 is proved in both Kleene (1952, pp. 134f, 167) and 
Ebbinghaus et al. (1984, p. 126), together with a wealth of other mathematical 
information about first-order languages. Boolos and Jeffrey (1974) contains a proof 
of the undecidability of first-order logic (though to reach Theorem 1.13 above from 
its results, some coding devices are needed). 


Suggested further reading 


There are many places where the subjects of this chapter can be pursued to a deeper level. Of 
those mentioned already in this chapter, Boolos and Jeffrey (1974) is a clear introductory text 
aimed at philosophers, while Hodges (1983) is a survey with an eye on philosophical issues. 
Ebbinghaus et al. (1984) is highly recommended for those prepared to face some nontrivial 
mathematics. Of older books, Church (1956) is still valuable for its philosophical and historical 
remarks, and Tarski (1983) is outstanding for its clear treatment of fundamental questions. 


Chapter 2 


Classical Logic II: 
Higher-Order Logic 


Stewart Shapiro 


2.1. Introduction and Overview 





A typical interpreted formal language has (first-order) variables that range over 
a collection of objects, sometimes called a domain-of-discourse. The domain is 
what the formal language is about. A language may also contain second-order 
variables that range over properties, sets, or relations on the items in the 
domain-of-discourse, or over functions from the domain to itself. For example, the sentence 
‘Alexander has all the qualities of a great leader’ would naturally be rendered 
with a second-order variable ranging over qualities. Similarly, the sentence ‘there 
is a property that holds of all and only the prime numbers’ has a variable ranging 
over properties of natural numbers. Third-order variables range over properties 
of properties, sets of sets, functions from properties to sets, etc. For example, 
according to some logicist accounts, the number 4 is the property shared by 
all properties that apply to exactly four objects in the domain. Accordingly, the 
number 4 is a third-order item. Fourth-order variables, and beyond, are characterized 
similarly. The phrase ‘higher-order variable’ refers to the variables beyond 
first-order. 

A language is first-order if it has first-order variables and no others. A language is 
second-order if it has first-order and second-order variables and no others, etc. A 
language is higher-order if it is at least second-order. 

The study of first-order formal languages is sometimes called first-order logic, 
or elementary logic. It occupies the bulk of contemporary logical theory. Most 
textbooks either ignore higher-order languages or else give them passing mention 
or brief treatment as an afterthought. However, virtually all of the founders of 
modern mathematical logic, such as Frege (1879), Peano (1889), and Whitehead 
and Russell (1910), presented higher-order languages. First-order logic appeared 
as a separate study when some authors separated out first-order languages as 
subsystems for special treatment. Hilbert and Ackermann (1950 [1928]) dub first-order 
logic the ‘restricted functional calculus.’ (For more on the historical emergence of 
first-order logic, see Moore (1996 [1980]), Moore (1988) and Shapiro (1991, 
ch. 7).) 

The early study of first-order logic revealed a number of important features [see 
chapter 1, esp. sections 1.9, 1.10]. Gödel’s completeness theorem, first published in 
1930, is that there is a complete, sound, and effective deductive system D for 
first-order logic: if Γ is a set of formulas in a first-order language and Φ is a single 
formula in that language, then Φ is deducible from Γ in D iff (if and only if) Φ is 
satisfied by every model of Γ. It follows that first-order logic is compact: for every 
set Γ of first-order formulas, if every finite subset of Γ is satisfiable, then Γ itself is 
satisfiable. The downward Löwenheim-Skolem theorem is that if Γ is a finite or 
denumerable set of first-order formulas that is satisfied by a model whose domain is 
infinite, then Γ is satisfied in a model whose domain is the natural numbers. The 
upward Löwenheim-Skolem theorem is that if Γ is a set of first-order formulas such 
that for each natural number n, Γ is satisfied in a model whose domain has at least 
n elements, then for every infinite cardinal κ, Γ is satisfied in a model whose domain 
has cardinality at least κ. 

These results are sometimes called ‘limitative theorems,’ since they indicate 
restrictions on the expressive resources of first-order languages. Many central 
mathematical notions, such as finitude, countability, and well-foundedness cannot be 
characterized in any first-order language, nor can there be an adequate description 
of structures like the natural numbers, the real numbers, and Euclidean space. None 
of these limitative theorems apply to higher-order languages, and there are 
second-order characterizations of the aforementioned concepts and structures. Second- and 
higher-order languages thus have strong expressive resources, almost as strong as the 
informal languages of mathematics. As a result, they are difficult to study, perhaps 
intractable. Cowles (1979, p. 129) put the situation well: 






It is well-known that first-order logic has a limited ability to express many of the 
concepts studied by mathematicians... [and] first-order logic... [has] an extensively 
developed and well-understood model theory. On the other hand, full second-order 
logic has all the expressive power needed to do mathematics, but has an unworkable 
model theory. 


Barwise (1985, p. 5) wrote, “As logicians, we do our subject a disservice by 
convincing others that logic is first-order and then convincing them that almost 
none of the concepts of modern mathematics can really be captured in first-order 
logic.” He concluded that “one thing is certain. There is no going back to the view 
that logic is first-order logic.” 

Section 2.2 presents a brief account of the languages, deductive system, and 
semantics of second-order logic, with brief mention of higher-order logic along the 
way. Section 2.3 provides sketches of the basic meta-theory, focusing on the fate of 
the limitative theorems and the expressive resources of second-order languages. 
The final section concerns general issues of logic-choice that underlie the trade-off 
between higher-order logic, with its ‘semantics complex enough to say something,’ 
and first-order logic with a semantics ‘simple enough to say something about’ 
(Cowles, 1979, p. 129). 


2.2. What Higher-Order Logic is 


Here we present the basics of a higher-order logical system, including a sketch of 
several formal languages, a deductive system, and several different model-theoretic 
semantics. For a fuller treatment, see Shapiro (1991, ch. 3). 


2.2.1. Languages and deductive systems 


A formal language is, in part, a set of strings on a fixed alphabet. The strings 
are called well-formed formulas (wffs), or simply formulas. For each language, a set K 
consisting of the non-logical terminology is designated. In arithmetic, for example, 
K would be {0, s, +, ·}, the symbols for zero, successor, addition, and multiplication. 

The reader is assumed to have some exposure to first-order logic; see [chapter 1], 
Boolos and Jeffrey (1989) or Mendelson (1987), for example. Our first group 
of languages, called L1K, is first-order without identity. Variables are lower-case 
Roman letters toward the end of the alphabet, with or without numerical 
subscripts. As noted above, these are the first-order variables. The connectives 
are negation ¬, conjunction &, disjunction ∨, material implication →, and the 
material biconditional ≡. The language has a universal quantifier ∀ and an existential 
quantifier ∃. A sentence is a formula without free variables, and a theory is a set 
of formulas. 

A first-order language with identity L1K= is obtained from L1K by adding a 
binary relation symbol =. The identity symbol is regarded as logical and is not in K. 
If t and u are terms, then we abbreviate ¬t = u as t ≠ u. 

The language L2K is obtained from L1K (not L1K=) by adding a stock of 
relation variables and function variables. These are called second-order variables. 
Relation variables are upper-case Roman letters toward the end of the alphabet, and 
function variables are letters like f, g, and h. Sometimes a superscript is used to indicate 
the degree, or number of places, of each second-order variable: X^1 is a monadic 
predicate variable; X^2 is a binary relation variable; f^1 is a unary function variable; 
f^2 is a binary function variable, etc. In most cases, the context determines the degree 
of a variable, and so we omit the superscript. 

Let ⟨t⟩_n represent a finite sequence of terms t_1, ..., t_n and let ⟨v⟩_n represent a 
finite sequence of distinct first-order variables v_1, ..., v_n. The form ∀⟨v⟩_n is an 
abbreviation of ∀v_1 ··· ∀v_n. There are four new formation rules:¹ 





If f^n is an n-place function variable and ⟨t⟩_n a sequence of n terms, then f^n⟨t⟩_n is 
a term. 


If R^n is an n-place relation variable and ⟨t⟩_n a sequence of n terms, then R^n⟨t⟩_n is 
an atomic formula. 


If Φ is a formula and V^n a relation variable, then (∀V^nΦ) is a formula. 
If Φ is a formula and f^n a function variable, then (∀f^nΦ) is a formula. 

The existential quantifiers are introduced as abbreviations: 


∃V^nΦ: ¬∀V^n¬Φ 
∃f^nΦ: ¬∀f^n¬Φ 


Thus, for example, ∃X∀x¬Xx asserts the existence of an ‘empty’ property, one 
which applies to nothing, and ∃X∀xXx asserts the existence of a ‘universal’ property. 

The symbol for identity between (first-order) objects is also introduced as an 
abbreviation. The relevant principle is the identity of indiscernibles: 


t = u: ∀X(Xt ≡ Xu) 


in which t and u are terms. This definition/abbreviation is not meant as a deep 
philosophical thesis about identity. 
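
On a finite domain these second-order sentences can be evaluated by brute force, letting a monadic variable X range over every subset of the domain. That choice of range is an assumption made for this sketch (the semantics is only specified later in the chapter), and the three-element domain and the function names are likewise hypothetical.

```python
from itertools import combinations


def subsets(domain):
    """All subsets of a finite domain: the assumed range of a monadic variable X."""
    items = list(domain)
    return [
        frozenset(c)
        for r in range(len(items) + 1)
        for c in combinations(items, r)
    ]


domain = {'a', 'b', 'c'}

# ∃X∀x¬Xx : some property applies to nothing (witness: the empty set).
empty_property_exists = any(
    all(x not in X for x in domain) for X in subsets(domain)
)

# ∃X∀xXx : some property applies to everything (witness: the whole domain).
universal_property_exists = any(
    all(x in X for x in domain) for X in subsets(domain)
)


def identical(t, u):
    """t = u, unfolded as ∀X(Xt ≡ Xu)."""
    return all((t in X) == (u in X) for X in subsets(domain))
```

Because the singleton {'a'} is among the values of X, `identical('a', 'b')` comes out false, which is how the abbreviation separates distinct objects under this reading.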
Consider the sentences: 


∃f[∀x∀y(fx = fy → x = y) & ∃x∀y(fy ≠ x)] 
∃f∃x∀X[(Xx & ∀y(Xy → Xfy)) → ∀yXy] 


Notice that these sentences have no non-logical terms. The first ‘asserts’ that the 
domain is (Dedekind) infinite, while the second asserts that the domain is at most 
countable. 

At this stage, a symbol for identity between second-order items like relations and 
functions is not included. This avoids, or at least postpones, a sticky philosophical 
issue concerning the nature of the items in the range of the second-order variables. 

Among contemporary authors, Quine is a longstanding and persistent critic of
second-order logic. One of his early attacks (Quine, 1941) targets traditional systems
in which the second-order variables range over intensional entities like properties,
propositional functions, or attributes. Quine is skeptical of the existence of such
entities, since there is no consensus on which properties, say, exist, nor on conditions
under which two properties are identical or distinct. Is the property of being an
equilateral triangle the same property as that of being an equiangular triangle?
Quine’s slogan “no entity without identity” indicates that one is not entitled to
speak of entities unless there is a clear and determinate criterion of identity for them.
Quine argues that these intractable metaphysical matters should not soil our pristine
work in logic and the foundations of mathematics. Trying to be helpful, he suggests
that variables ranging over properties be replaced with variables ranging over
respectable extensional entities like sets, so it is possible to ‘identify’ the property
which applies to m and m alone with the singleton set {m}.

For the present study, one can think of relations and functions as extensional, or 
as intensional, or one can leave it open. Little turns on this here and so words like
‘property’, ‘class’, and ‘set’ are used interchangeably.*

Some advocates of intensional entities, like properties or attributes, argue that
they must be defined or constructed in levels, so that properties defined at a given
level become available for use in definitions at later levels; see, for example, Whitehead
and Russell (1910) or, for a readable and sympathetic development, Hazen
(1983). The relevant thesis is sometimes called the vicious circle principle. Accordingly,
a relation is predicative, or of level 0, if it can be defined without referring to
relations. A relation is of level 1 if it is not predicative, but can be defined with
reference to predicative relations only. A relation is of level 2 if it is not of level 1 but
can be defined with reference to level 1 and predicative relations. To develop this, it
might be stipulated that each higher-order variable have a numerical subscript to
indicate its level. For example, the sentence $\forall P_3 \exists Q_1 \forall x(Q_1 x \equiv P_3 x)$ asserts that for
each level 3 predicate there is a level 1 predicate with the same extension. Call the
resulting language L2pK, with the ‘p’ standing for ‘predicative’. I do not have much
to say about ramified, or predicative, languages, except by way of comparison.

There are two directions for further expansion of our languages. One concerns
the set K of non-logical terminology. Second-order variables, as well as non-logical
predicate, relation, and function names, may be called higher-order terms, since they
denote relations and functions. By way of analogy, this opens the possibility of non- 
logical symbols (in K) for functions on relations, etc. An example would be a 
property TWO of properties such that TWO(P) ‘asserts’ that P applies to exactly 
two things. 

A second expansion is to introduce variables for relations on relations, functions
of predicates, functions of functions, etc. These would be third-order variables. Then
one could add non-logical constants (to K) for relations on functions of predicates,
and the like, and one could add fourth-order variables ranging over such things, thus
producing a fourth-order language, and so on.

These higher-order languages can also be ramified. Each variable would be annotated
somehow to indicate both its type, the kinds of objects, relations, etc. it applies
to, and its level, the place in the hierarchy at which it is defined. A type 3 level 0
predicate would be a predicate of type 2 predicates defined by reference to type 2
relations. A type 3 level 1 predicate would be a predicate of type 2 predicates defined
by reference to type 3 level 0 relations, etc. This ramified type theory is a notational
nightmare.


2.2.2 Deductive system


Assuming that the reader is familiar with a standard first-order deductive system, this
section presents an extension to the second-order languages L2K. First, the quantifier
axioms and rules are adapted to the second-order quantifiers. For a Frege-Church
type system, these would be:

$\forall X^n \Phi(X^n) \to \Phi(T)$, where $T$ is either an n-place relation variable free for $X^n$ in
$\Phi$ or an n-place relation letter in the set K of non-logical terminology.

$\forall f^n \Phi(f^n) \to \Phi(p)$, where $p$ is either an n-place function variable free for $f^n$ in $\Phi$,
or a non-logical n-place function letter.

From $\Phi \to \Psi(X)$ infer $\Phi \to \forall X \Psi(X)$, provided that $X$ does not occur free in $\Phi$,
or in any premise of the deduction.

From $\Phi \to \Psi(f)$ infer $\Phi \to \forall f \Psi(f)$, provided that $f$ does not occur free in $\Phi$, or
in any premise of the deduction.

Next there is the axiom scheme of comprehension: 


$$\exists X^n \forall \langle x \rangle_n (X^n \langle x \rangle_n \equiv \Phi(\langle x \rangle_n))$$

one instance for each formula $\Phi$ in L2K and each relation variable $X^n$, provided that
$X^n$ does not occur free in $\Phi$. The scheme registers the thesis that every formula
determines a relation or, more precisely, for every formula there is a relation with
the same extension.

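The final item is a form of the axiom of choice:*

$$\forall X^{n+1} (\forall \langle x \rangle_n \exists y\, X^{n+1} \langle x \rangle_n y \to \exists f^n \forall \langle x \rangle_n\, X^{n+1} \langle x \rangle_n f^n \langle x \rangle_n)$$

The antecedent of this conditional asserts that for each sequence $\langle x \rangle_n$ there is at least
one $y$ such that the sequence $\langle x \rangle_n, y$ satisfies $X^{n+1}$. The consequent asserts the existence
of a function that ‘picks out’ one such $y$ for each $\langle x \rangle_n$.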

Call this deductive system D2. Recall that the language L2K does not contain a
primitive symbol for (first-order) identity. Rather, $x = y$ is taken to be an abbreviation
of

$$\forall X(Xx \equiv Xy)$$

To justify this, one should derive the counterparts of the identity axioms of L1K=:

$\forall x(x = x)$, which comes to $\forall x \forall X(Xx \equiv Xx)$





These are tedious exercises in Frege-Church type systems, and they are virtually
immediate in their natural deduction counterparts. 

It is straightforward, but perhaps tedious, to establish an indiscernibility principle 
for relation variables: 


$$\forall P \forall Q[\forall \langle x \rangle_n (P \langle x \rangle_n \equiv Q \langle x \rangle_n) \to (\Phi(P) \to \Phi(Q))]$$

for each formula $\Phi$ such that $Q$ is free for $P$ in $\Phi(P)$. The (meta-theoretic) proof of
this principle proceeds by induction on the complexity of the formula $\Phi$. This partly
justifies an extensional orientation toward the higher-order terminology.

The deductive system needs to be modified for the ramified languages L2pK.
Recall that each relation variable of L2pK is to contain a subscript to indicate its
level in the hierarchy of definition. Each relation must be definable in terms of
variables ranging over lower-level relations. The distinctive feature of the ramified
system is its comprehension scheme. Recall the unramified version:

$$\exists X^n \forall \langle x \rangle_n (X^n \langle x \rangle_n \equiv \Phi(\langle x \rangle_n))$$

If the formula $\Phi$ itself contains bound higher-order variables, then the corresponding
instance of the comprehension scheme is called impredicative, and it represents
a violation of the ‘vicious circle principle’ that motivates the ramification. Thus, one
can replace the unrestricted comprehension scheme with an axiom scheme of ramified
comprehension:

$\exists X_i \forall \langle x \rangle_n (X_i \langle x \rangle_n \equiv \Phi(\langle x \rangle_n))$, provided that $X_i$ has degree $n$ and the level of each
relation variable (free or bound) that occurs in $\Phi$ is less than $i$.


It is straightforward to extend the other quantifier rules to the ramified system.
First, if $j \leq i$ and $X_j$ has the same degree as $X_i$ and is free for $X_i$ in $\Phi$, then
$\forall X_i \Phi(X_i) \to \Phi(X_j)$ is an axiom. That is, if $\Phi$ holds for all relations of a certain
degree and level, then $\Phi$ holds of any relation of the same degree and the same or
lower level. The other rule, from $\Phi \to \Psi(X_i)$ infer $\Phi \to \forall X_i \Psi(X_i)$ (provided that $X_i$ does
not occur free in $\Phi$ or in any premise of the deduction), carries over without change.
Call this deductive system D2p. The axiom of choice is not included, since the
philosophical tendency underlying ramified systems is to restrict the variables to
definable relations.

The aforementioned derivations in D2 of the first-order identity axioms cannot be
carried out in D2p, since they involve the full comprehension principle. In fact, it is
not clear that a satisfactory characterization of identity can be given in L2pK.

Whitehead and Russell (1910) include an axiom of reducibility,

$$\forall X_i \exists Y_0 \forall x(Y_0 x \equiv X_i x)$$

asserting that for every level $i$ relation, there is a predicative relation with the same
extension. This has the effect of collapsing the levels and making the system equivalent
to D2 (minus the axiom of choice).

Deductive systems for the further extensions of L2K and L2pK, to third- and 
higher-order languages, are straightforward extensions of those just considered, and 
so will not be given here.


2.2.3 Semantics


This sub-section presents three model-theoretic semantics for the unramified second-order
languages L2K. Familiarity with standard model-theoretic semantics for first-order
languages is assumed [see chapter 1], with just this pause to establish notation.

Each model or interpretation of the first-order L1K or L1K= is a structure M = (d, I),
in which d is a non-empty set, the domain of the model, and I is an interpretation
function that assigns appropriate items constructed from d to the non-logical terminology.
For example, if b is an individual constant in K, then I(b) is a member of d,
and if B is a binary relation symbol in K, then I(B) is a subset of d × d. A variable-assignment
s is a function from the variables of L1K to d.

For each model and assignment, there is a denotation function that assigns a
member of the domain to each term of the language. The relation of satisfaction
between models, assignments, and formulas is then defined in the usual manner. If
a model M and an assignment s on M satisfy $\Phi$, then we write $M, s \models \Phi$. If $M, s \models \Phi$
for every assignment s and every formula $\Phi$ in a set $\Gamma$, then M is said to be a model
of $\Gamma$. A formula $\Phi$ is a semantic consequence of $\Gamma$ if for every model M and assignment
s on M, if $M, s \models \Psi$ for every $\Psi$ in $\Gamma$, then $M, s \models \Phi$. This is sometimes written $\Gamma \models \Phi$.

Our first two semantics for the second-order L2K build on the semantics for its
first-order counterpart L1K. Each model has (d, I) as a substructure where, as
above, d is the domain and I an interpretation of the items in K. What is added in
each case is a range for the relation and function variables. For the second-order
languages, a variable assignment is a function that assigns a member of d to each
first-order variable and an appropriate item to each relation and function variable.
The denotation function for the terms of L2K is a straightforward extension of the
denotation function for L1K. The new clause is:


Let M be a model and s an assignment on M. Let $f^n$ be an n-place function
variable and $\langle t \rangle_n$ a sequence of n terms. The denotation of $f^n \langle t \rangle_n$ under M, s is the
value of the function $s(f^n)$ at the sequence of members of the domain denoted
by the members of $\langle t \rangle_n$.


There are three new clauses in the definition of satisfaction:

If $X^n$ is a relation variable and $\langle t \rangle_n$ a sequence of n terms, then $M, s \models X^n \langle t \rangle_n$ if
the sequence of members of the domain denoted by the members of $\langle t \rangle_n$ is an
element of $s(X^n)$.

$M, s \models \forall X \Phi$ if $M, s' \models \Phi$ for every assignment s′ that agrees with s at every variable
except possibly X.

$M, s \models \forall f \Phi$ if $M, s' \models \Phi$ for every assignment s′ that agrees with s at every variable
except possibly f.





All that remains is to specify the range of the second-order variables. 

Our first specimen is standard semantics, which makes the logic properly second-order.
A standard model of L2K is the same as a model of the first-order L1K,
namely a structure (d, I). A variable-assignment is a function that assigns a member
of d to each first-order variable, a subset of $d^n$ to each n-place relation variable, and
a function from $d^n$ to d to each n-place function variable. Thus, under standard
semantics, the monadic relation variables range over the entire powerset of the
domain: every subset of the domain is in the range of these variables. Similarly, the
binary predicate variables range over the entire powerset of d × d, one-place function
variables range over the collection of all functions from the domain to itself, etc.
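
For a finite domain, the ranges that standard semantics assigns to the higher-order variables can be listed explicitly. The following Python sketch is an illustration only: the names `powerset` and `standard_ranges` are hypothetical, relations are encoded as frozensets of tuples, and functions as dicts. It builds the ranges of the one-place variables over a three-element domain and evaluates two simple existential sentences over them.

```python
from itertools import combinations, product


def powerset(items):
    """Return every subset of items as a frozenset."""
    items = list(items)
    return [frozenset(c) for size in range(len(items) + 1)
            for c in combinations(items, size)]


def standard_ranges(d, n):
    """Ranges of the n-place relation and function variables over domain d."""
    tuples = list(product(d, repeat=n))              # the set d^n
    relations = powerset(tuples)                     # every subset of d^n
    functions = [dict(zip(tuples, values))           # every map from d^n to d
                 for values in product(d, repeat=len(tuples))]
    return relations, functions


d = [0, 1, 2]
relations, functions = standard_ranges(d, 1)

# Some value of X applies to nothing (an 'empty' property).
exists_empty = any(all((x,) not in X for x in d) for X in relations)
# Some value of X applies to everything (a 'universal' property).
exists_universal = any(all((x,) in X for x in d) for X in relations)

print(len(relations), len(functions), exists_empty, exists_universal)
```

Over this domain there are 8 monadic relations and 27 one-place functions, and both sentences come out true. Nothing beyond the domain had to be chosen, which is the point of the standard reading: fixing d fixes every range.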

The notions of validity and satisfiability are defined in the usual manner: a formula
$\Phi$ is standardly valid, or is a standard logical truth, if $M, s \models \Phi$ for every model M and
assignment s on M; a set $\Gamma$ of formulas is standardly satisfiable if there is an M, s such
that $M, s \models \Phi$ for every $\Phi$ in $\Gamma$; and $\Phi$ is a standard consequence of $\Gamma$ if the union of
$\Gamma$ with $\{\neg\Phi\}$ is not standardly satisfiable. In the following, ‘valid’ is sometimes used
for ‘standardly valid’, ‘satisfiable’ for ‘standardly satisfiable’, etc. This reflects my
own preferences, but it is also more or less the received practice.

Notice that a standard model for L2K is the same as a model of its first-order
counterpart L1K. That is, in standard semantics, by fixing a domain one thereby
fixes the range of both the first-order variables and the second-order variables. There 
is no further ‘interpreting’ to be done. This is not the case with the next semantics, 
where one must separately determine a range for the first-order variables and a range 
for the second-order variables. 

The central feature of Henkin semantics is that in a given model, the relation
variables range over a fixed collection of relations on the domain, which may not
include all of the relations. Similarly, the function variables range over a fixed
collection of functions on the domain. A Henkin model of L2K is a structure
$M^H = (d, D, F, I)$, in which d is a domain and I an interpretation function, as above.
For each n, D(n) is a non-empty subset of the powerset of $d^n$ and F(n) is a non-empty
collection of functions from $d^n$ to d. The idea is that D(n) is the range of the
n-place relation variables and F(n) is the range of the n-place function variables.
A variable-assignment is thus a function s such that s assigns a member of d to each
first-order variable (as usual), s assigns a member of D(n) to each n-place relation
variable, and s assigns a member of F(n) to each n-place function variable. In Henkin
semantics, then, variable assignments are restricted to those that assign members of
the various D(n) and F(n) to the higher-order variables. The notions of Henkin-validity,
Henkin-satisfaction, and Henkin-consequence are defined in the straightforward
manner.

It is immediate that a standard model of L2K is equivalent to the Henkin model
in which, for each n, D(n) is the powerset of $d^n$, and F(n) is the collection of all n-place
functions from $d^n$ to d. Such Henkin models are sometimes called full models.
Let M be a standard model and M′ the corresponding full model. Then for each
assignment s and each formula $\Phi$, $M, s \models \Phi$ under standard semantics iff $M', s \models \Phi$
under Henkin semantics. Thus, if a formula $\Phi$ is a Henkin-consequence of a set $\Gamma$,
then $\Phi$ is a standard consequence of $\Gamma$. Section 2.3 shows that the converse fails.
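
The difference between a full model and a smaller Henkin model shows up as soon as a sentence quantifies over relations. The Python sketch below is an illustration with hypothetical names. It evaluates the sentence "every X has a complement Y" on a two-element domain, once with D(1) equal to the whole powerset and once with a D(1) that omits some subsets. The sample sentence and the smaller D(1) are assumptions chosen for this illustration, not examples from the text.

```python
from itertools import combinations


def powerset(items):
    """Return every subset of items as a frozenset."""
    items = list(items)
    return [frozenset(c) for size in range(len(items) + 1)
            for c in combinations(items, size)]


def closed_under_complement(d, D1):
    """Evaluate: for all X there is a Y with (Yx iff not Xx) for all x,
    where X and Y range over the collection D1 only."""
    return all(
        any(all((x in Y) != (x in X) for x in d) for Y in D1)
        for X in D1
    )


d = [0, 1]
full_D1 = powerset(d)                         # D(1) of the full model
small_D1 = [frozenset(), frozenset({0})]      # D(1) of a smaller Henkin model

print(closed_under_complement(d, full_D1))    # same verdict as standard semantics
print(closed_under_complement(d, small_D1))
```

The sentence holds in the full model and fails in the smaller one, where the empty set has no complement available. The only change between the two runs is the collection handed to the quantifiers.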

On both of these semantical systems, the items in the range of higher-order
variables are extensional entities, either sets or functions. However, since there is
no symbol for ‘higher-order identity’ in the language, one is free to maintain an
intensional understanding of the higher-order entities. For the purposes of model-theoretic
semantics, sets can serve as surrogates for the relevant attributes, properties,
or propositional functions. Henkin semantics might be attractive to an advocate
of intensional items. Such a philosopher might suggest that in a given Henkin 
model, the specified range of the monadic relation variables, for example, would be 
the collection of sets that are the extensions of the relevant attributes, properties, or 
propositional functions (Cocchiarella, 196 [1988)}). If our advocate of intensional 
items believes that for every arbitrary collection S of n-tuples on the domain, there
is an attribute whose extension is S (and similarly for functions-in-intension), then
she will favor standard semantics.

Boolos (1996a [1984], 1996b [1985]) has proposed an alternate way to understand
at least the monadic second-order relation variables. According to both standard
and Henkin semantics, a monadic, second-order existential quantifier $\exists X$ can be
read

There is a set X

or

There is a property X

in which case, of course, the locution invokes classes or properties. Against this,
Boolos suggests that the quantifier be considered a counterpart of a plural quantifier,

There are (objects)

in natural language. Consider this sentence:

Some critics admire only one another.


It has a (more or less) straightforward second-order rendering, taking the class of
critics to be the domain of discourse:

$$\exists X(\exists x\, Xx \;\&\; \forall x \forall y((Xx \;\&\; Axy) \to (x \neq y \;\&\; Xy)))$$

According to standard or Henkin semantics, the formula would correspond to

There is a non-empty class (or property) X of critics such that for any x in X and
any y, if x admires y, then $x \neq y$ and y is in X.

But this implies the existence of a class (or property), while the original ‘some critics
admire only one another’ does not, at least prima facie.
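
On a finite domain of critics the second-order rendering can be tested by searching the subsets of the domain. The Python sketch below is only an illustration: the function names and the sample ‘admires’ relations are hypothetical, and the search reads X as ranging over all non-empty subsets, as in standard semantics.

```python
from itertools import combinations


def nonempty_subsets(items):
    """Yield every non-empty subset of items, smallest first."""
    items = list(items)
    for size in range(1, len(items) + 1):
        for members in combinations(items, size):
            yield set(members)


def admire_only_one_another(critics, admires):
    """Return a non-empty X such that whenever some x in X admires y,
    x and y are distinct and y is in X too; return None if there is none."""
    for X in nonempty_subsets(critics):
        if all(x != y and y in X
               for x in X for y in critics if (x, y) in admires):
            return X
    return None


critics = ["ann", "bob", "cat"]
mutual = {("ann", "bob"), ("bob", "ann"), ("cat", "cat")}
no_group = {("ann", "ann"), ("bob", "ann"), ("cat", "bob")}

print(admire_only_one_another(critics, mutual))
print(admire_only_one_another(critics, no_group))
```

With the first relation the search finds the pair ann and bob; with the second it finds nothing. Note that a critic who admires no one at all would satisfy the formula alone, vacuously, which is a feature of the rendering rather than of the search.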

Natural languages, like English, allow the plural construction and, in particular, 
English contains plural quantifiers like 


There are some dogs that like each other and hate most cats.


Boolos argues that the plural construction be employed in the meta-language used 
in developing formal semantics. The relevant locution is 


There are objects X, such that.


As in the first-order case, the variable serves as a place-holder, for purposes of cross-reference,
much like a pronoun. Construed this way, a monadic second-order language
has no ontology beyond that of its first-order counterpart. In set theory, for
example, the Russell sentence, $\exists X \forall x(Xx \equiv x \notin x)$, is a consequence of the comprehension
scheme. According to standard semantics, it entails that there is a class that
is not coextensive with any set in the domain. Admittedly, this takes some getting
used to. On Boolos’s interpretation, however, the Russell sentence reads

There are some sets such that any set is one of them just in case it is not a
member of itself

which is a harmless truism. Omitting the details of Boolos’s (1996b [1985]) rigorous,
model-theoretic semantics for second-order languages with monadic relation
variables, it is worth noting that the end result has similar meta-theoretic properties
to standard semantics, as presented here.


2.3 Meta-Theory and Expressive Resources


This section presents some of the main results concerning the ability of second-order
languages to capture central mathematical structures and notions. For more
detail, see Shapiro (1991, ch. 4, 5). 


2.3.1 Henkin semantics


As presented, the deductive system D2 is not sound for Henkin semantics. Although
it is routine to verify that every Henkin model satisfies the axioms and rules of a
first-order deductive system and the instances of the second-order quantifier axioms
and rules, some Henkin models do not satisfy the comprehension scheme. Consider,
for example, a structure M = (d, D, F, I) in which d is a set with two members $a \neq b$;
D(2) has a single member, the relation $\{\langle a, a \rangle, \langle b, a \rangle\}$; and F(1) has a single
member, the identity function. Then M does not satisfy the following instance of the
comprehension scheme:

$$\exists X \forall x \forall y(Xxy \equiv x \neq x)$$

In effect, this axiom asserts the existence of an empty binary relation, but M does
not have one. Similarly, M does not satisfy the axiom of choice: the one relation in D(2)
relates every element to a, but the identity function, the only function in F(1), does
not send b to a.
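
The counterexample is small enough to check mechanically. The Python sketch below is an illustration with hypothetical function names. It encodes the two-element Henkin model just described and, for comparison, the full model on the same domain, and then evaluates the comprehension instance and the relevant instance of the axiom of choice in each.

```python
from itertools import combinations, product

d = ["a", "b"]
pairs = list(product(d, repeat=2))

# The Henkin model described in the text: one binary relation, one function.
D2 = [frozenset({("a", "a"), ("b", "a")})]
F1 = [{"a": "a", "b": "b"}]                      # the identity function

# The full model on the same domain, for comparison.
full_D2 = [frozenset(c) for size in range(len(pairs) + 1)
           for c in combinations(pairs, size)]
full_F1 = [dict(zip(d, values)) for values in product(d, repeat=len(d))]


def comprehension_instance(d, D2):
    """Some X in D2 satisfies (Xxy iff x != x) for all x, y: X is empty."""
    return any(all(((x, y) in X) == (x != x) for x in d for y in d)
               for X in D2)


def choice_instance(d, D2, F1):
    """For every X in D2: if each x has some y with Xxy, then some f in F1
    has X x f(x) for every x."""
    return all(
        not all(any((x, y) in X for y in d) for x in d)
        or any(all((x, f[x]) in X for x in d) for f in F1)
        for X in D2
    )


print(comprehension_instance(d, D2), choice_instance(d, D2, F1))
print(comprehension_instance(d, full_D2), choice_instance(d, full_D2, full_F1))
```

Both checks fail in the small model and succeed in the full one.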

Define a Henkin model to be faithful to D2, or simply faithful, if it satisfies the
axiom of choice and every instance of the comprehension scheme. That is, a Henkin
model is faithful if it contains every relation definable via the comprehension scheme
and the functions promised by the axiom of choice. All subsequent discussion is
restricted to faithful models.

Soundness is now immediate:

Theorem 2.1 Soundness If $\Gamma \vdash_{D2} \Phi$, then $\Phi$ is satisfied by every faithful Henkin
model that satisfies every member of $\Gamma$.


In most respects, languages like L2K under Henkin semantics are like first-order
languages. In particular, Henkin semantics is complete, compact, and satisfies both
Löwenheim-Skolem theorems.

Theorem 2.2 Completeness (Henkin, 1950) Let $\Gamma$ be a set of formulas of L2K.
If $\Gamma$ is consistent in D2, then there is a faithful Henkin model that satisfies $\Gamma$.
Equivalently, for every formula $\Phi$ and set $\Gamma$, $\Gamma \vdash_{D2} \Phi$ if $M, s \models \Phi$ for every faithful
Henkin model M and assignment s that satisfies every member of $\Gamma$.


The proof is a straightforward adaptation of the Henkin construction establishing
Gödel’s theorem for the completeness of first-order logic (Shapiro (1991, pp. 89-
as 

If the set K of non-logical terminology is countable, then the constructed Henkin
model might be called doubly countable in that the domain and each D(n), F(n) is
either finite or denumerably infinite. That is, the constructed model has only countably
many relations and functions. Since each infinite domain has uncountably many
relations, some (indeed, most) are not in the range of the higher-order quantifiers of the
indicated Henkin model. Thus, if the constructed Henkin model has an infinite
domain, it is not a full model.

As in the first-order case, compactness is a corollary of completeness:

Theorem 2.3 Compactness Let $\Gamma$ be a set of formulas of L2K. If every finite
subset of $\Gamma$ is satisfiable in a faithful Henkin model, then $\Gamma$ itself is satisfiable in a
faithful Henkin model.


Let M = (d, D, F, I) be a Henkin model. Define M′ = (d′, D′, F′, I′) to be a Henkin-submodel
of M if

1. $d' \subseteq d$

2. for each natural number n, each relation in D′(n) is the restriction to d′ of a
relation in D(n)

3. each function in F′(n) is the restriction to d′ of a function in F(n)

4. I′ and I assign the same elements to each individual constant, and

5. the interpretation of each predicate, relation, and function symbol under I′ is the
restriction to d′ of its interpretation under I.


Theorem 2.4 Downward Löwenheim-Skolem theorem Let M = (d, D, F, I) be
a Henkin model of L2K. Then there is a Henkin-submodel M′ = (d′, D′, F′, I′)
of M such that

1. d′ and each D′(n), F′(n) are all at most denumerably infinite (or the cardinality
of the set K, if K is uncountable), and

2. if $\Phi$ is any formula and s′ any variable assignment on M′, then there is a
corresponding variable assignment s on M such that $M', s' \models \Phi$ iff $M, s \models \Phi$.

In other words, for every Henkin model M there is a countable Henkin-submodel
M′ such that M is equivalent to M′.


The proof is a more or less straightforward adaptation of a proof of the corresponding
first-order theorem; see Shapiro (1991, pp. 93-4); [see also chapter 1]. It makes
essential use of the axiom of choice (in the meta-theory).

As with the first-order case, the next theorem is a corollary of compactness:

Theorem 2.5 Upward Löwenheim-Skolem theorem Let $\Gamma$ be a set of sentences
of L2K. If, for each natural number n, there is a faithful Henkin model of $\Gamma$
whose domain has at least n elements, then for every infinite cardinal $\kappa$ there is
a faithful Henkin model of $\Gamma$ whose domain has at least $\kappa$-many elements.


A set $\Gamma$ of sentences is categorical if any two models of $\Gamma$ are isomorphic. As in the
first-order case, theorems 2.4 and 2.5 indicate that no theory with an infinite Henkin
model is what may be called ‘Henkin-categorical’. Thus, second-order languages
with Henkin semantics are not adequate to characterize infinite structures up to
isomorphism.

In this regard, the results reported in this subsection indicate that second-order
languages with Henkin semantics are much like first-order languages. One can think
of a language like L2K as a multi-sorted first-order language, with the predication
(or membership) relation between objects and relations (or sets) as non-logical, like
the members of K. Shapiro (1991, ch. 3, 4) develops a semantics along these lines
and shows it to be equivalent to Henkin semantics. In particular, each multi-sorted
first-order model is equivalent to a corresponding Henkin model, and vice versa.

When it comes to first-order languages, variables can range over any type of 
entity, so long as there are coherent things to say about them. Unless a philosopher 
has qualms about the existence of relations or sets on a domain d, there can be no
objection to a language like L2K. However, with Henkin semantics (restricted to 
faithful models), the only relations/sets that are assumed to exist are some choice 
functions and those relations/sets definable in the language. The distinctive expres- 
sive resources of higher-order logic come when one assumes that the second-order 
variables range over every function and relation/set on d. Thus the next subsection
turns to standard semantics. 





2.3.2 Standard semantics


One important meta-theorem does carry over from first-order logic to second-order
logic with standard semantics. As noted above, D2 is not sound for Henkin semantics.
However, since full Henkin models are faithful, D2 is sound for standard
semantics.


Theorem 2.6 Soundness Let $\Gamma$ be a set of formulas and $\Phi$ a single formula of
L2K. If $\Phi$ can be deduced from $\Gamma$ in D2, then $\Phi$ is a standard consequence of $\Gamma$.


Like the first-order case, the proof is a straightforward check of each axiom and rule
of inference. Most of the axioms and rules require no substantial assumptions about
the set-theoretic universe underlying the model theory. The clause involving the
comprehension scheme uses a principle of separation in the background set theory,
and the clause for the axiom of choice uses a principle of choice in the meta-theory.
The apparent circularity is the same as in the first-order case. One uses principles in
the meta-theory to show that certain axioms and rules are sound. Readers who are
uncomfortable with separation or choice can remove those principles from D2.

The primary item here is the refutation of completeness, compactness, and the
Löwenheim-Skolem theorems for standard semantics. The key items are the existence
of categorical axiomatizations of the natural numbers and the real numbers,
and Gödel’s theorem on the incompleteness of arithmetic.

The language of arithmetic has $A = \{0, s, +, \cdot\}$ as its set of non-logical terminology.
The following axioms are first-order.

Successor axiom: $\forall x(sx \neq 0) \;\&\; \forall x \forall y(sx = sy \to x = y)$

Addition axiom: $\forall x(x + 0 = x) \;\&\; \forall x \forall y(x + sy = s(x + y))$

Multiplication axiom: $\forall x(x \cdot 0 = 0) \;\&\; \forall x \forall y(x \cdot sy = x \cdot y + x)$

Then there is the induction axiom, a proper second-order statement:

Induction axiom: $\forall X[(X0 \;\&\; \forall x(Xx \to Xsx)) \to \forall x\, Xx]$


Let AR (for ‘arithmetic’) be the conjunction of these four axioms. 

Let N be the model of L2A whose domain is the set of natural numbers and 
which assigns zero to 0, and assigns the successor function, the addition function, 
and the multiplication function to s, +, and -, respectively. This, of course, is the 
intended interpretation. Then $N \models AR$. The next theorem is that, in an important
sense, N is the only (standard) model of AR: 


Theorem 2.7 Categoricity of arithmetic (Dedekind) Let $M1 = (d_1, I_1)$ and
$M2 = (d_2, I_2)$ be two (standard) models in the language L2A. If $M1 \models AR$ and
$M2 \models AR$, then M1 and M2 are isomorphic: there is a one-to-one function f from
$d_1$ onto $d_2$ that preserves the structure of the models.


(See Shapiro (1991, pp. 82-3) for details of the proof, although Dedekind (1963
[1888]) remains a readable source.) It follows from theorem 2.7 and the fact that
$N \models AR$ that if M is a model of AR, then the domain of M is denumerably infinite.


Corollary 2.8 Let $\Phi$ be a sentence of L2A. Then $\Phi$ is true of the natural
numbers (i.e., $N \models \Phi$) iff $AR \to \Phi$ is a (standard) logical truth.


In real analysis, the non-logical terminology is $B = \{0, 1, +, \cdot, \leq\}$. The first axioms
are those of an ordered field, all of which are first-order.* The sole second-order
statement is the axiom of completeness, asserting that every non-empty bounded
property (or set) has a least upper bound:

$$\forall X\{[\exists y\, Xy \;\&\; \exists x \forall y(Xy \to y \leq x)] \to \exists x[\forall y(Xy \to y \leq x) \;\&\; \forall z(\forall y(Xy \to y \leq z) \to x \leq z)]\}$$


Let AN (for ‘analysis’) be the conjunction of the axioms of real analysis. The real
number structure constitutes the intended model, and AN is categorical:

Theorem 2.9 Categoricity of analysis Let M1 and M2 be two models in the
language L2B. If $M1 \models AN$ and $M2 \models AN$, then M1 and M2 are isomorphic.

See Barwise and Feferman (1985, p. 84) for a sketch of the proof.

We now refute the analogues of theorems 2.2-5 for L2K with standard semantics.
The Löwenheim-Skolem theorems are easiest. Second-order arithmetic, AR, is
categorical (theorem 2.7). It has denumerably infinite models and no uncountable
models. Second-order analysis, AN, is also categorical (theorem 2.9). It has uncountable
models and no countable models. Thus,

Theorem 2.10 Both of the Löwenheim-Skolem theorems fail for second-order
languages with standard semantics.


The following purely logical sentence

(FIN) $\neg \exists f[\forall x \forall y(fx = fy \to x = y) \;\&\; \exists x \forall y(fy \neq x)]$

asserts that there is no one-to-one function from the domain to a proper subset of
the domain. Thus, FIN is satisfied by all and only those models whose domains
are (Dedekind) finite. The compactness of first-order logic and Henkin semantics
for L2K indicates that there is no characterization of finitude in those systems.
The upward Löwenheim-Skolem theorem entails that any formula that is satisfied
in every finite model also has (arbitrarily large) infinite models.

Let $\Gamma$ be the set of sentences consisting of FIN and

$$\{\exists x_1 \exists x_2(x_1 \neq x_2),\ \exists x_1 \exists x_2 \exists x_3(x_1 \neq x_2 \;\&\; x_1 \neq x_3 \;\&\; x_2 \neq x_3),\ \ldots\}$$

In other words, the set $\Gamma$ contains a formula asserting that the domain is finite and,
for each natural number n, $\Gamma$ contains a formula asserting that the domain has at
least n elements. Thus, $\Gamma$ is not satisfiable. Let $\Gamma'$ be any finite subset of $\Gamma$, and let
m be the maximum number of occurrences of the existential quantifier in any one
member of $\Gamma'$. Then any finite structure whose domain has at least m elements satisfies
every member of $\Gamma'$. Thus, $\Gamma'$ is satisfiable and we have:


Theorem 2.11 Standard semantics for L2K is not compact.
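
The argument for theorem 2.11 can be rehearsed on small finite domains. The Python sketch below is an illustration only, with hypothetical function names. It evaluates FIN by brute force over all functions on the domain (standard semantics), evaluates each ‘at least n elements’ sentence by searching for n distinct elements, and checks that a finite subset of the set is satisfied by a sufficiently large finite domain, while no finite domain of size k satisfies the sentence for k + 1.

```python
from itertools import permutations, product


def at_least(n, d):
    """There are x1, ..., xn, pairwise distinct: d has at least n elements."""
    return any(True for _ in permutations(list(d), n))


def fin(d):
    """FIN: no one-to-one function maps d onto a proper subset of d."""
    dom = list(d)
    for values in product(dom, repeat=len(dom)):
        if len(set(values)) == len(dom) and set(values) != set(dom):
            return False
    return True


def satisfies(ns, d):
    """Does the finite domain d satisfy FIN together with the
    'at least n elements' sentence for every n in ns?"""
    return fin(d) and all(at_least(n, d) for n in ns)


finite_subset = [2, 3, 5]                 # indices of a finite subset of the set
witness = range(max(finite_subset))       # a domain with exactly 5 elements
print(satisfies(finite_subset, witness))

# A domain of size k never satisfies the sentence 'at least k + 1 elements'.
print(any(satisfies(range(2, k + 2), range(k)) for k in range(1, 5)))
```

The first check succeeds and the second fails. This is the finite shadow of the argument: each finite subset has a finite model, but the sentences taken all together outrun every finite domain, and FIN rules out the infinite ones.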


Our last item is the refutation of completeness. Let D be any effective deductive
system that is sound for the language L2A (under standard semantics). Consider the
set

$$\Gamma = \{\Phi \mid \vdash_D (AR \to \Phi)\}$$


Since D is effective, the set $\Gamma$ is recursively enumerable and, from corollary 2.8, every
member of $\Gamma$ is true of the natural numbers. Gödel’s incompleteness theorem (see,
for example, Gödel (1965 [1934]); [see chapter 4]) is that the collection of true
first-order sentences of arithmetic is not recursively enumerable. So let $\Psi$ be a true
sentence of first-order arithmetic that is not in $\Gamma$. Then, by corollary 2.8, $(AR \to \Psi)$
is a (standard) logical truth, but $(AR \to \Psi)$ is not derivable in D. Thus, there is no
effective, sound deductive system that is complete for standard semantics:


Theorem 2.12 Let D be any effective deductive system that is sound for L2A.
Then D is not weakly complete: there is a logical truth that is not a theorem of D.
In short, standard semantics is inherently incomplete.


A fortiori, D2 is incomplete. 


2.3.3 First-order and (standard) second-order theories


In first-order arithmetic, the (second-order) induction principle is replaced by a
scheme. If $\Phi$ is a formula in the language L1A= of first-order arithmetic, then

$$(\Phi(0) \;\&\; \forall x(\Phi(x) \to \Phi(sx))) \to \forall x\, \Phi(x)$$

is an axiom of first-order arithmetic. The theory thus has infinitely many
axioms.
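
Because the scheme contributes one axiom per formula, its instances can be produced mechanically. The Python sketch below illustrates this with a deliberately naive encoding: a formula is a format string in which `{t}` marks the places of the induction variable. Both the encoding and the function name are hypothetical, and the sketch does no checking for variable clashes.

```python
def induction_instance(phi):
    """Return the induction axiom for the formula template phi.

    phi is a format string in which {t} marks where the induction
    variable occurs (a hypothetical encoding used only for this sketch).
    """
    base = phi.format(t="0")
    hypothesis = phi.format(t="x")
    step = phi.format(t="sx")
    return f"(({base}) & ∀x(({hypothesis}) → ({step}))) → ∀x({hypothesis})"


templates = [
    "{t} + 0 = {t}",
    "0 + {t} = {t}",
    "∃y({t} = y + y ∨ {t} = s(y + y))",
]
for phi in templates:
    print(induction_instance(phi))
```

Each template yields a different axiom, and there is no bound on the templates that can be supplied. The second-order induction axiom replaces this whole family with a single sentence by quantifying over X directly.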

Similarly, in the characterization AN of real analysis, the second-order item is the 
principle of completeness, stating that every bounded non-empty set of real numbers 
has a least upper bound. First-order real analysis is obtained by replacing the single 
completeness axiom with the completeness scheme, 


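$$[\exists x\, \Phi(x) \;\&\; \exists x \forall y(\Phi(y) \to y \leq x)] \to \exists x[\forall y(\Phi(y) \to y \leq x) \;\&\; \forall z(\forall y(\Phi(y) \to y \leq z) \to x \leq z)]$$

one instance for each formula $\Phi$ of the language L1B= of real analysis that contains
neither x nor z free.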

The difference between, say, second-order real analysis and its first-order counterpart
is that, in the latter, one cannot directly state that every non-empty bounded set
has a least upper bound. The closest one can come is a separate principle for each
such set which is definable by a formula in the language of first-order analysis. The
mathematician who uses first-order analysis thus cannot apply the completeness
principle to a set until she is assured the set is definable in the relevant language.
Since there are sets which are not definable, the first-order theory has models which
are not isomorphic to the real numbers. These are sometimes called non-standard
models. Indeed, the Löwenheim-Skolem theorems indicate that for every infinite
cardinal $\kappa$, there are models of first-order arithmetic and models of first-order analysis
whose domain has cardinality $\kappa$. The study of non-standard models has proven
fruitful in illuminating the original informal theories.

As noted above, there is a second-order characterization of finitude, but no
first-order characterization. There are also second-order characterizations of
countability, well-foundedness, minimal closure, and many other central mathematical
notions. It follows from the compactness of first-order logic that there are no
first-order (or Henkin-second-order) characterizations of these notions. Any consistent
attempt to formulate such a characterization of these notions in a first-order (or
Henkin-second-order) theory will have unintended models that miss the mark.

Lindström (1969) showed that, in a sense, the limitative properties characterize
first-order logic. Let L be any model-theoretic type of logic and assume that L has
the property of the downward Löwenheim-Skolem theorem. If L is also compact,
then there is a sense in which L is equivalent to first-order logic: L cannot make any
distinction among models that cannot be made with the corresponding first-order
language. For better or worse, then, standard semantics is what makes second-order
logic distinctive. The categoricity results and the concomitant failure of the limitative
properties are the source of both the expressive strength and the main shortcoming
of second-order logic.


2.4. Philosophical Issues: How to Pick a Logic


It is widely agreed, at least implicitly, that mathematicians succeed in describing
various notions and structures up to isomorphism and they succeed in communicating
information about these structures to each other. Most agree that when mathematicians
refer to the ‘natural numbers,’ the ‘real numbers,’ etc., they are talking
about the same structures. Similarly, there is little doubt among mathematicians and
most philosophers that notions like finitude and well-foundedness are clear and
unequivocal. In other words, there is a near (but not universal) consensus that the
informal language of mathematics has expressive resources sufficient for the ordinary
description and communication of these basic structures and notions.

Advocates of higher-order logic argue that to capture the semantics of the
informal languages of mathematics, a logical system should register the successful
description and communication of mathematical structures and concepts. The
desideratum is that the expressive resources of the formal languages should match those
of the mathematical discourse it models. Wang (1974, p. 154) takes a similar line:


When we are interested in set theory or classical analysis, the Löwenheim-Skolem
theorem is usually taken as a sort of defect (often thought to be inevitable) of the
first-order logic ... [W]hat is established (by Lindström's theorems) is not that first-order
logic is the only possible logic but rather that it is the only possible logic when we in a
sense deny reality to the concept of uncountable


See also Montague (1965), Corcoran (1996 [1980]), Isaacson (1987), and Shapiro
(1991, ch. 5).

Kreisel (1967) delimited some epistemological features of the second-order
axiomatizations of mathematical theories. He argued that relying on infinitely many
axioms presented via an axiom scheme is unnatural. Suppose, for example, that
someone is asked why he believes that each instance of the completeness scheme of
first-order real analysis is true of the real numbers. The theorist cannot give a
separate justification for each of the infinitely many axioms. Nor can he claim that
the scheme characterizes the real numbers since, as already seen, no first-order
axiomatization can characterize this structure. Kreisel argued that the reason
mathematicians believe the instances of the axiom scheme is that each instance follows
from the single second-order completeness axiom. It is not clear that the informal
generalization over the instances of a scheme is any less problematic than the explicit
generalization over properties or sets in the second-order axiom.

A related problem is that each first-order scheme is tied to the ingredients of
the particular first-order language in use at the time. Mathematicians, however, are
quick to apply the induction or completeness principles to sets regardless of whether
they are definable in the given first-order language. Indeed, they usually do not
check for definability in this or that language. This is manifest in the practice of
embedding structures in each other. For example, when one sees that there is a
structure isomorphic to the real numbers in the set-theoretic hierarchy, one can use
set theory to shed light on the real numbers. This works by applying the completeness
principle of analysis to sets of real numbers definable in set theory, whether or
not such sets can be defined in the language of analysis. One cannot tell in advance
what resources are needed to shed light on a mathematical structure, even after the
structure has been adequately characterized.

In the section on second-order logic of his (otherwise) influential textbook, Church
(1956, p. 326n) remarked:


our definition of the [standard second-order] consequences of a system of postulates
can be seen to be not essentially different from [that] required for the ... treatment
of classical mathematics ... It is true that the non-effective notion of consequence, as
we have introduced it ... presupposes a certain absolute notion of ALL propositional
functions of individuals. But this is presupposed also in classical mathematics, especially
classical analysis


On the other hand, the expressive power of second-order languages under standard
semantics carries a cost. For example, there is a single second-order sentence χ that
is a categorical characterization of the first inaccessible rank of the set-theoretic
hierarchy. Virtually any truth Φ of any branch of mathematics short of set theory is a
logical consequence of χ, and corresponds to a logical truth of the form χ → Φ. So
mathematical truth for just about any branch of mathematics (short of set theory)
can be reduced to second-order logical truth. Moreover, there is a second-order
sentence, with no non-logical terminology, that is equivalent to the generalized
continuum hypothesis. The sentence is logically true if and only if the generalized
continuum hypothesis holds; see Shapiro (1991, ch. 5). We have already encountered
a second-order sentence equivalent to the axiom of choice. It is perhaps
counterintuitive to hold that these mathematical statements are principles of logic.

For these reasons, some authors argue that second-order logic (with standard
semantics) is not logic at all, but is rather an obscure form of mathematics. A
first-order theory, with an intended interpretation, has variables ranging over a
domain-of-discourse. A second-order theory has variables ranging over the entire powerset
of the domain, a larger collection. Quine (1986, ch. 5), for example, argues that
with the presence of set-theoretic notions in the language, one crosses the border
out of logic, into mathematics. He calls second-order logic “set theory in disguise,”
a wolf “in sheep's clothing.” The idea, presumably, is that proper mathematics is
powerful and awe-inspiring; logic is, or ought to be, a mere ‘sheep’: “Set theory's
staggering existential assumptions are ... hidden ... in the tacit shift from schematic
predicate letter to quantifiable set variable” (p. 68).⁶

Come right back. A popular theme in contemporary philosophy, at least in North
America, is that there are no sharp borders between disciplines. Quine himself is a
champion of this idea, arguing that in the seamless “web of belief” there is no sharp
distinction, no difference in kind, between mathematics and, say, zoology.
Mathematics occurs throughout the “web.” Why should logic, especially the logic of
mathematics, be different? Why expect logic to be free of substantial expressive
resources?

In short, most parties to this debate agree that the informal language of
mathematics is adequate to describe and communicate the various structures and notions.
The second-order camp follows Church and accepts a mathematically rich language,
while the first-order camp, led by Quine, argues that logic must fail where informal
mathematics succeeds.⁷ Perhaps the boundary dispute is not of much interest. As
long as no deceit is intended, one can apply the honorific label ‘logic’ at will. The
deeper issues concern the purposes of logical study. Historically, one goal of logical
study was to present a canon of inference. The plan would be to present a calculus
which codifies correct inference patterns. From this perspective, second-order logic
is a non-starter. It follows from Gödel's completeness theorem that the consequence
relation of first-order logic is effective, but as we have seen, second-order logic (with
standard semantics) is inherently incomplete. Indeed, the set of second-order logical
truths is not even definable in (second-order) arithmetic.

The presentation of calculi as canons of inference does not exhaust the traditional
scope of logic. It is widely (but not universally) held that deductive systems must
themselves adhere to a prior notion of logical consequence. Must this prior notion be
effective? Informally, logical consequence is sometimes defined in terms of the
meanings of a certain collection of terms, the so-called ‘logical terminology.’ This is
consonant with the slogan that logical consequence is a matter of ‘form.’ Most
theorists agree that the truth functional connectives (‘¬,’ ‘&,’ ‘∨,’ ‘→,’ ‘≡’) and the
first-order quantifiers are logical. From this perspective, the issue of second-order
logic is whether the membership (or predication) relation and bound variables
ranging over relations or classes are logical. We have here another border dispute, but
perhaps one can be eclectic. In correspondence, Tarski once wrote (1987, p. 29):











sometimes it seems ... convenient to include mathematical terms, like the
[membership] relation, in the class of logical ones, and sometimes I prefer to restrict myself to
terms of ‘elementary [i.e., first-order] logic.’ Is any problem involved here?


Let a thousand flowers bloom. 


Suggested further reading
The technical development of higher-order logic can be found in some textbooks on
mathematical logic. Hilbert and Ackermann (1950 [1928]) remains a readable source, while
Church (1956) and Boolos and Jeffrey (1989) provide a more contemporary perspective.
Van Heijenoort (1967) collects together many of the seminal historical works on the subject
(in English translation). My own Shapiro (1991) consists of a detailed development of
second-order logic, and a defense of its use in foundational studies, as well as an extensive
bibliography. Quine (1986, ch. 5), Jané (1993), Tharp (1996 [1975]), and Weston (1996 [1976])
constitute a sample of the opposition to second-order logic. The Shapiro (1996) anthology
contains reprints of journal articles both defending and attacking higher-order logic, and
articles on the Skolem paradox. Shapiro (1999) is a reply to some of the critics of Shapiro
(1991), and includes references to some of the literature. Lewis (1991) includes an
informative treatment of the philosophical issues concerning plural quantification. The contrasting
shortcomings of first-order logic and higher-order logic have led to the development and study
of systems that are, in a sense, intermediate between them. See Shapiro (2001) for a broad
overview and Barwise and Feferman (1985) for some highly technical samples.


Notes 


1 For the sake of readability, I do not always note the distinction between object language
variables (like %) and meta-variables (like ») that range over object language variables.
Context will indicate which is meant. Incidentally, no generality would be lost if we left
out function variables, since an n + 1 place relation variable can serve as a surrogate for an
n place function variable.

2 If pressed, I would follow Quine and adopt an extensional interpretation. One option
would be to take the identity sign as an abbreviation, using the principle of extensionality:


P = Q: ∀x(Px ≡ Qx) and f = g: ∀x(fx = gx)


3 The axiom of choice has a troubled history, but it now is essential to most branches of
mathematics, including mathematical logic. If the axiom of choice is dropped from the
deductive system, it should be replaced with a principle of comprehension for functions
(or one could just omit the variables ranging over functions).

4 In a presentation to a conference honoring Alonzo Church (Buffalo, May 1990), Henkin
remarked that he first discovered the completeness of higher-order logic under (what is here
called) Henkin semantics, and only later adapted the proof to the first-order case.

5 The axioms for an ordered field are that addition and multiplication are associative and
commutative, multiplication is distributive over addition, 0 is the additive identity, 1 is the
multiplicative identity, every element has an additive inverse, every element but 0 has a
multiplicative inverse, < is a linear order, and the elements greater than or equal to 0
are closed under addition and multiplication. See any treatment of abstract algebra, for
example MacLane and Birkhoff (1967, ch. IV, V).

6 Historically the ‘shift’ went in the other direction, from the ‘quantifiable variable’ of
higher-order logic to the non-logical schemes of first-order logic.

7 There is another, currently less popular line that rejects the presupposition that the
mathematical structures and notions in question are unequivocal. The Löwenheim-Skolem
theorems indicate that there are no unambiguous notions of ‘finite’, ‘countable’, ‘natural
number’, etc. This skeptical view is sometimes called ‘Skolemite relativism’. The
underlying thesis is that the model theory of first-order logic (or Henkin semantics) accurately
reflects the ontological/epistemic/semantic situation. There is nothing unequivocal to
capture or describe with either formal or informal languages of mathematics. So advocates
of this relativism favor first-order logic.
Chapter 3


Set Theory 
John P. Burgess 


Set theory is the branch of mathematics concerned with the general properties of
aggregates of points, numbers, or arbitrary elements. It was created in the late
nineteenth century, mainly by Georg Cantor. After the discovery of certain
contradictions euphemistically called paradoxes, it was reduced to axiomatic form in the
early twentieth century, mainly by Ernst Zermelo and Abraham Fraenkel. Thereafter
it became widely accepted as a framework, or ‘foundation,’ for the development
of the other branches of modern, abstract mathematics. Today, its basic notions and
notations are widely used even outside mathematics proper.

Set theory impinges on philosophy in several ways. First, the more formal areas of
philosophy are among the areas outside mathematics proper where set-theoretic
notions and notations are used. Further, the main topic of set theory (considered as
a branch of mathematics in its own right rather than a framework for developing
other branches of mathematics) is infinity, traditionally a topic of philosophical
speculation. Finally, the fact that mathematics is in some sense reducible to set
theory (though the specification of just what sense this is remains itself a
philosophical problem) means that many problems of philosophy of mathematics reduce
to problems of philosophy of set theory.


3.1. Basic Notions and the Algebra of Sets 


A set is one thing composed of many things, its elements, the relation of element a
to set A being written a ∈ A, with the negation abbreviated a ∉ A. When every
element of the set A is also an element of the set B, A is said to be a subset of B,
written A ⊆ B. The mathematical notion of set is distinguished from the
metaphysical notion of property by the axiom of extensionality, according to which two
sets having exactly the same elements are identical.

It follows that for any condition Φ(x) there can be at most one set whose
elements are all and only the x for which the condition holds. This set is denoted
{x | Φ(x)}, and several variations on this notation are also used. Thus if u is a set and
Φ(x) a condition, then the result of separating out from the given set just those of
its elements for which the given condition holds, that is, {x | x ∈ u and Φ(x)}, is
written {x ∈ u | Φ(x)}. Thus also, if u is a set and Ψ(x, y) a condition such that for
every x there is a unique y = ψ(x) for which Ψ(x, y) holds, then the result of
replacing the elements of the given set by the elements associated with them by the given
condition, that is, {y | Ψ(x, y) for some x ∈ u}, is written {ψ(x) | x ∈ u}. Also, if
a₁, a₂, ..., aₙ are any given finitely many elements, the set {x | x = a₁ or x = a₂
or ... or x = aₙ} is written {a₁, a₂, ..., aₙ}. In particular, the empty set {x | x ≠ x}, the
singleton set {x | x = a}, and the (unordered) pair {x | x = a or x = b} are written ∅,
{a}, and {a, b}.

Widely used basic notions of set theory include the following. 






• The intersection A ∩ B of two sets A and B is the set whose elements are all and
only those x that are elements both of A and of B. Two sets are called disjoint
if their intersection is empty.

• The union A ∪ B of two sets A and B is the set whose elements are all and only
those x that are elements either of A or of B or of both. More generally, the
union ∪X of a set X of sets is the set whose elements are all and only those x
such that x ∈ A for some A ∈ X.

• The difference A − B of two sets A and B is the set of all x that are elements of
A but are not elements of B.

• The power set ℘(U) of a given set U is the set of all its subsets.


In a context where one is concerned only with subsets of U, the difference U − A
may be written simply −A and called the complement of A.

Various basic ‘algebraic’ laws of intersection, union, and complement follow
immediately from the foregoing definitions and the logical properties of conjunction
‘and,’ disjunction ‘or,’ and negation ‘not.’ (For example, −(A ∩ B) = −A ∪ −B.) The
derivation of such laws of the algebra of sets constitutes the first chapter in the
development of set theory.
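
As a small illustration of these definitions, the following sketch uses Python's built-in sets to compute intersection, union, difference, power set, and complement, and to check the De Morgan law just cited on one example. The universe U, the subsets A and B, and the helper names complement and power_set are illustrative assumptions, not part of the text; a check on one finite example is of course not a proof of the law.

```python
from itertools import combinations

# Hypothetical finite universe and subsets, chosen only for illustration.
U = set(range(10))
A = {0, 1, 2, 3}
B = {2, 3, 4, 5}

def complement(X, universe=U):
    """Complement of X relative to the universe, i.e. the difference U - X."""
    return universe - X

def power_set(X):
    """Set of all subsets of X, each represented as a frozenset."""
    items = list(X)
    return {frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)}

intersection = A & B   # elements of both A and B
union = A | B          # elements of A or of B or of both
difference = A - B     # elements of A that are not elements of B

# One instance of the law -(A ∩ B) = -A ∪ -B.
de_morgan_holds = complement(A & B) == complement(A) | complement(B)
```

On this example the intersection is {2, 3} and the difference is {0, 1}; a set with three elements has a power set with eight.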


3.2. Further Basic Notions: Relations and Functions 


The derivation of the basic laws of relations forms the second chapter in the
development of set theory. But prior to the notion of relation comes the notion of the
ordered pair (a, b) with first component a and second component b, which is subject
to the following basic law:


(a, b) = (c, d) if and only if a = c and b = d.


A (binary) relation on a set A (respectively: between elements of one set A and
those of another set B) is then any set of ordered pairs whose two components are
both elements of A (respectively: whose first component is an element of A and
whose second component is an element of B). One often writes aRb for (a, b) ∈ R.
One may also consider ordered triples and trinary relations, ordered quadruples and
quaternary relations, and so on.


• If R is a relation, the converse (also called inverse) relation R⁻¹ is the relation
{(a, b) | bRa}. Thus if R is the relation of parent to child, R⁻¹ would be the
relation of child to parent.

• If R and S are relations, the composition S∘R is the relation {(a, c) | for some b,
aRb and bSc}. Thus if S is the relation of parent to child, and R the relation of
sibling to sibling, then S∘R would be the relation of aunt or uncle to niece or
nephew.

• The domain of a relation R is {a | aRb for some b} and the range is {b | aRb for
some a}.


Various basic ‘algebraic’ laws follow immediately from the foregoing definitions and
logic. (For example, (S∘R)⁻¹ = R⁻¹∘S⁻¹.)
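
The notions of converse, composition, domain, and range can be sketched by representing a relation as a Python set of ordered pairs. The function names and the tiny family of names below are assumptions made for illustration only; the last line checks the law for the converse of a composition on this one example.

```python
def converse(R):
    """Converse relation: all pairs (b, a) such that (a, b) is in R."""
    return {(b, a) for (a, b) in R}

def compose(S, R):
    """Composition S o R: pairs (a, c) with aRb and bSc for some b."""
    return {(a, c) for (a, b) in R for (b2, c) in S if b == b2}

def domain(R):
    """All a such that aRb for some b."""
    return {a for (a, _) in R}

def range_(R):
    """All b such that aRb for some a."""
    return {b for (_, b) in R}

# Hypothetical family: ann and bob are siblings, bob is a parent of cat.
sibling = {("ann", "bob"), ("bob", "ann")}
parent = {("bob", "cat")}

# Sibling of a parent: the aunt-or-uncle relation.
aunt_or_uncle = compose(parent, sibling)

# One instance of the law (S o R)^-1 = R^-1 o S^-1.
law_holds = converse(compose(parent, sibling)) == compose(
    converse(sibling), converse(parent))
```

Here aunt_or_uncle contains the single pair ("ann", "cat"), as the parent and sibling example in the text leads one to expect.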

Notions and results pertaining to several special kinds of relations are also widely
used outside mathematics proper, and throughout set theory itself, and some of
these may be collected here for future reference. One important special kind of
relation is an equivalence relation E on a set A, by which is meant any relation for
which the following three properties hold for all a, b, c ∈ A:


1 Reflexivity aEa.
2 Symmetry If aEb then bEa.
3 Transitivity If aEb and bEc then aEc.


The set {b | aEb} is then called the equivalence class of a.


• A selector for an equivalence E on A is a subset S of A having exactly one
element from each equivalence class.

• A partition of a set A is a set X of subsets of A such that any two distinct sets in
X are disjoint and the union of all the sets in X is all of A.


The notions of equivalence and partition are linked by the fact that if E is an
equivalence on A, then the set of E-equivalence classes is a partition of A, while if X
is a partition of A, then the relation aEb defined to hold if and only if a and b belong
to the same set in X is an equivalence on A.
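
The link between equivalences and partitions can be made concrete for finite sets. In the sketch below the set A, the relation E (congruence modulo 3), and all function names are illustrative assumptions; the code forms the equivalence classes, confirms that they make up a partition, and recovers E from that partition.

```python
def equivalence_classes(A, E):
    """The set of classes {b | aEb}, one for each a in A."""
    return {frozenset(b for b in A if (a, b) in E) for a in A}

def is_partition(A, X):
    """True if distinct members of X are disjoint and their union is A."""
    blocks = list(X)
    disjoint = all(not (p & q)
                   for i, p in enumerate(blocks) for q in blocks[i + 1:])
    covers = set().union(*blocks) == set(A)
    return disjoint and covers

def relation_from_partition(X):
    """Pairs (a, b) such that a and b belong to the same set in X."""
    return {(a, b) for block in X for a in block for b in block}

# Hypothetical example: congruence modulo 3 on {0, ..., 5}.
A = set(range(6))
E = {(a, b) for a in A for b in A if a % 3 == b % 3}

classes = equivalence_classes(A, E)
round_trip = relation_from_partition(classes) == E
```

For this choice of E the classes are {0, 3}, {1, 4}, and {2, 5}, and passing from E to its classes and back returns E.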

Other special kinds of relations are ordering relations:


• A partial order R on a set A is any relation with the three properties of
reflexivity, transitivity, and anti-symmetry, where the last requires that for all a, b ∈ A,
if aRb and bRa, then a = b.

• A (total) order is a partial order with the additional property of connectedness,
requiring that for all a, b ∈ A, either aRb or bRa.

• Associated with any partial or total order R is the strict partial or total order R′
given by: aR′b iff (if and only if) aRb but not a = b.

When the order R is written ≤, the associated strict order R′ is written <.
Terminology such as ‘least’ and ‘greatest’ is used in connection with orders in the obvious
way.


• A well-order R on a set A is an order with the property that every non-empty
subset of A has an R-least element. (For example, the usual order on the natural
numbers N is a well-order in this sense.)


Perhaps the most important special kind of relations are functions. Here a function
from a set A to a set B is a relation f between elements of A and elements of B such
that for every a ∈ A there exists a unique b ∈ B such that afb. This unique b is
written f(a).


• If U ⊆ A, then the image f[U] of U under f is defined to be {f(a) | a ∈ U},
and if V ⊆ B, then the pre-image f⁻¹[V] of V under f is defined to be
{a | f(a) ∈ V}.

• If for every b ∈ B there exists at most one a ∈ A with f(a) = b, then the function
is said to be one-to-one or an injection.

• If for every b ∈ B there exists at least one a ∈ A with f(a) = b, then the function
is said to be onto or a surjection.

• A function that is both an injection and a surjection is called a bijection.

• A bijection from a set to itself is called a permutation.

• A (binary) operation on A is a function from the set of ordered pairs of elements
of A to A itself, and if § is such an operation, one often writes a₁ § a₂ for
§(a₁, a₂). (For example, addition and multiplication on any of the usual number
systems of algebra are operations in this sense.)
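
These definitions can be tried out on finite examples by treating a function, as the text does, as a set of ordered pairs. The sets A and B, the relations f and g, and the function names below are hypothetical choices made for the sketch.

```python
def is_function(f, A, B):
    """True if f relates each a in A to exactly one b in B."""
    pairs_ok = all(a in A and b in B for (a, b) in f)
    unique = all(len({b for (x, b) in f if x == a}) == 1 for a in A)
    return pairs_ok and unique

def image(f, U):
    """f[U] = {f(a) | a in U}."""
    return {b for (a, b) in f if a in U}

def preimage(f, V):
    """The pre-image of V: {a | f(a) in V}."""
    return {a for (a, b) in f if b in V}

def is_injection(f, B):
    """Each b in B has at most one a with f(a) = b."""
    return all(len(preimage(f, {b})) <= 1 for b in B)

def is_surjection(f, B):
    """Each b in B has at least one a with f(a) = b."""
    return all(len(preimage(f, {b})) >= 1 for b in B)

def is_bijection(f, B):
    return is_injection(f, B) and is_surjection(f, B)

# Hypothetical examples.
A = {0, 1, 2}
B = {"x", "y", "z"}
f = {(0, "x"), (1, "y"), (2, "x")}   # neither one-to-one nor onto
g = {(0, "y"), (1, "z"), (2, "x")}   # a bijection from A to B
```

In this example f sends both 0 and 2 to "x" and never takes the value "z", so it is neither an injection nor a surjection, while g is both.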





3.3. Transfinite Cardinals 


The greatest novelty introduced by Cantor was his theory of transfinite numbers. He
defined two sets A and B to have the same cardinal number |A| = |B| iff there
exists a bijection from A to B. He also defined α ≤ β to mean that there are sets A
and B with |A| = α and |B| = β having A ⊆ B. This notion ≤ has the properties of
a partial order. (The anti-symmetry property, that if α ≤ β and β ≤ α then α = β, is
a substantial result called the Cantor-Bernstein Theorem.) The natural numbers are
also called the finite cardinals, the cardinal number of the set {0, 1, ..., n − 1} of
natural numbers less than a given natural number n being identified with n itself.
Cantor called the cardinal of the set N of all natural numbers ℵ₀, and the cardinal of
the set R of all real numbers c. Sets of cardinal ℵ₀ are called denumerable, and sets
that are either finite or denumerable are called countable.

Cantor also introduced an arithmetic of cardinal numbers. First, auxiliary notions
of addition, multiplication, and exponentiation on sets may be defined as follows:


• Addition A + B = {(0, a) | a ∈ A} ∪ {(1, b) | b ∈ B}
• Multiplication A · B = {(a, b) | a ∈ A and b ∈ B}
• Exponentiation Aᴮ or A ↑ B = {f | f is a function from B to A}

Then notions of addition, multiplication, and exponentiation for cardinals may be
defined in terms of these auxiliary notions:


• Addition |A| + |B| = |A + B|
• Multiplication |A| · |B| = |A · B|
• Exponentiation |A| ↑ |B| = |A ↑ B|


(For these latter definitions to make sense, it is first shown that if |A| = |C| and
|B| = |D|, then |A + B| = |C + D|, |A · B| = |C · D|, and |A ↑ B| = |C ↑ D|.)
The notions thus defined agree with the usual notions for finite cardinals or natural
numbers. Many of the basic laws that hold for finite cardinals can be shown to hold
for all cardinals.
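
For finite sets the agreement with ordinary arithmetic can be checked directly. The following is a minimal sketch in which set_sum, set_product, and set_power are assumed names for the three auxiliary operations, and functions from B to A are represented as frozensets of (argument, value) pairs.

```python
from itertools import product

def set_sum(A, B):
    """A + B: a disjoint union, tagging elements of A with 0 and of B with 1."""
    return {(0, a) for a in A} | {(1, b) for b in B}

def set_product(A, B):
    """A · B: all ordered pairs (a, b) with a in A and b in B."""
    return {(a, b) for a in A for b in B}

def set_power(A, B):
    """A ↑ B: all functions from B to A, each as a frozenset of pairs."""
    args = list(B)
    return {frozenset(zip(args, values))
            for values in product(list(A), repeat=len(args))}

# Hypothetical finite sets with 3 and 2 elements.
A = {"a", "b", "c"}
B = {"x", "y"}

sum_size = len(set_sum(A, B))          # 3 + 2
product_size = len(set_product(A, B))  # 3 · 2
power_size = len(set_power(A, B))      # 3 ↑ 2
```

The tags 0 and 1 in set_sum matter when the two sets overlap: the sum of {1, 2} and {2, 3} has four elements, whereas their union has only three.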

Other such laws fail strikingly for transfinite cardinals. One of Cantor's more
striking results was that ℵ₀ · ℵ₀ = ℵ₀ (from which it follows also that ℵ₀ + ℵ₀ = ℵ₀).
To establish this, a bijection between pairs of natural numbers and single natural
numbers is set up, whose pattern may be guessed from the following:


(0,0) (1,0) (0,1) (1,1) (2,0) (2,1) (0,2) (1,2) (2,2)
  0     1     2     3     4     5     6     7     8
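
One way to read the displayed pattern is that pairs are listed in order of their larger component, and within each such group first the pairs (m, 0), ..., (m, m − 1) and then (0, m), ..., (m, m). The closed formulas below are an assumption based on that reading of the first nine pairs, not something stated in the text; pair_index and index_pair are hypothetical names.

```python
from math import isqrt

def pair_index(a, b):
    """Position of the pair (a, b) in the enumeration, counting from 0."""
    return a * a + b if a > b else b * b + b + a

def index_pair(n):
    """Inverse of pair_index: the pair that sits at position n."""
    m = isqrt(n)     # the larger component of the pair
    r = n - m * m    # position within the group of pairs whose maximum is m
    return (m, r) if r < m else (r - m, m)

first_nine = [index_pair(n) for n in range(9)]
```

Evaluating first_nine reproduces the nine pairs displayed above in the same order, and the two functions undo one another, which is what it means for the correspondence to be a bijection.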


Cantor also established that 2 ↑ ℵ₀ = c, and in general 2 ↑ |A| = |℘(A)|. It then
follows that ℵ₀ < c as a consequence of the celebrated result, known simply
as Cantor's Theorem, that in general |A| < |℘(A)|. This last was established by
Cantor's celebrated diagonal argument: Supposing there is a bijection f between A
and ℘(A), consider the set D = {a ∈ A | a ∉ f(a)}. Since f is a surjection, there is
some d ∈ A such that f(d) = D. But then we have the contradiction that d ∈ D
if and only if d ∉ f(d) = D. As special cases or easy applications of some of the kinds
of theorems just cited, Cantor showed that there are no more rational than natural
numbers, more real than rational numbers, but again no more complex than real
numbers.
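
The diagonal argument can be watched at work on a small finite set, where every function from A to its power set can be listed exhaustively. The sketch below is only an illustration under that finite assumption (the theorem itself covers infinite sets as well); the names power_set and diagonal_set and the choice A = [0, 1, 2] are hypothetical.

```python
from itertools import combinations, product

def power_set(A):
    """All subsets of A, each as a frozenset."""
    items = list(A)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def diagonal_set(A, f):
    """D = {a in A | a not in f(a)} for f from A to the power set of A."""
    return frozenset(a for a in A if a not in f[a])

A = [0, 1, 2]
subsets = power_set(A)

# For every function f from A to the power set of A (given as a dict),
# check that the diagonal set D is not among the values of f.
diagonal_always_missed = all(
    diagonal_set(A, dict(zip(A, values))) not in values
    for values in product(subsets, repeat=len(A))
)
```

All 512 functions from a three-element set to its eight subsets are examined, and in each case D differs from every value f(a), so none of them is a surjection.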


3.4. Transfinite Ordinals 


Let now R and S be orders on sets A and B respectively. An isomorphism between R
and S (or between ‘the set A equipped with the order R’ and ‘the set B equipped
with the order S’) is a bijection f from A to B with the further property that it is
order preserving, meaning that if a₁ and a₂ are elements of A and b₁ = f(a₁) and
b₂ = f(a₂) the associated elements of B, then we have a₁Ra₂ iff b₁Sb₂. An
isomorphism shows that R and S have the same ‘structure,’ differing only by the
substitution of elements of B for elements of A. A notion of isomorphism for operations
is similarly defined. Isomorphism is indeed a central concept in modern, abstract
mathematics.

Cantor defined well-orders R and S on sets A and B to have the same ordinal
number |R| = |S| iff there exists an isomorphism between R and S. He defined ρ ≤ σ
to mean there are well-orders R and S on sets A and B with |R| = ρ and |S| = σ
having R an initial segment of S. This last condition means intuitively that S looks
like R with some additional elements added at the end, or more formally that the
following hold:


1 A ⊆ B
2 If a₁, a₂ ∈ A, then a₁Sa₂ iff a₁Ra₂
3 If a ∈ A and b ∈ B − A, then aSb.


The notion ≤ has the properties of a well-order. (This is not obvious: the
anti-symmetry property, that if ρ ≤ σ and σ ≤ ρ then ρ = σ, and the connectedness
property, that either ρ ≤ σ or σ ≤ ρ, are fairly substantial results.) The natural
numbers are also the finite ordinals, the ordinal of the usual order on the set
{0, 1, ..., n − 1} of natural numbers less than a given natural number n
(to which every other order on this set is isomorphic) being identified with n itself.
Cantor called the ordinal of the natural numbers in their usual order ω.

The least ordinal is 0. For every ordinal ρ, there is a least ordinal greater than it,
the (immediate) successor ρ′ of ρ. It may be given an explicit description as follows:


If ρ = |R| where R is a well-order on a set A, and B consists of A with one
additional element b, and S is the well-order on B that is like the well-order R
on A but with the additional element b placed after every element of A, then
ρ′ = |S|.


For every set T of ordinals, there is a least ordinal at least as great as all elements
of T, called the supremum of T, written sup T. It may also be given an explicit
description, though it is a rather complicated one. An ordinal that is neither zero
nor a successor is called a limit. The least limit ordinal is ω.

Cantor also introduced an arithmetic of ordinal numbers. The notions of
addition, multiplication, and exponentiation may be defined by the following recursion
equations (though it is a substantial theorem that such equations do suffice to
determine the values of sums, products, and powers, uniquely).


ρ + 0 = ρ     ρ + (σ′) = (ρ + σ)′      at limits ρ + τ = sup{ρ + σ | σ < τ}

ρ · 0 = 0     ρ · (σ′) = (ρ · σ) + ρ   at limits ρ · τ = sup{ρ · σ | σ < τ}

ρ ↑ 0 = 1     ρ ↑ σ′ = (ρ ↑ σ) · ρ     at limits ρ ↑ τ = sup{ρ ↑ σ | σ < τ}


Thus after the finite ordinals comes the ordinal ω, after which come


ω + 1, ω + 2, ω + 3, ..., ω + ω = ω · 2, ω · 2 + 1, ω · 2 + 2, ..., ω · 3, ...,
ω · 4, ..., ω · ω = ω ↑ 2, ω ↑ 2 + 1, ..., ω ↑ 2 + ω, ..., ω ↑ 2 · 2, ..., ω ↑ 3, ...,
ω ↑ 4, ..., ω ↑ ω,


and so on. The notions thus defined agree with the usual notions for finite ordinals
or natural numbers. Many of the basic laws that hold for natural numbers can be
shown to hold for all ordinals. Other such laws fail for transfinite ordinals. Notably
the commutative law fails, since for instance 1 + ω = ω ≠ ω + 1.
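
The failure of commutativity can be seen in a small model of the ordinals below ω · ω. The representation is an assumption made for this sketch: each such ordinal is written ω · w + n for natural numbers w and n, and addition lets a summand with a non-zero ω-part absorb the finite part of whatever stands to its left. The class SmallOrdinal and the function add are hypothetical names.

```python
from typing import NamedTuple

class SmallOrdinal(NamedTuple):
    """The ordinal ω·w + n, for natural numbers w and n (so below ω·ω)."""
    w: int
    n: int

def add(x, y):
    """Ordinal sum x + y for ordinals below ω·ω."""
    if y.w == 0:
        # Adding a finite ordinal on the right just extends the finite tail.
        return SmallOrdinal(x.w, x.n + y.n)
    # Otherwise the finite tail of x is absorbed by the ω-part of y.
    return SmallOrdinal(x.w + y.w, y.n)

ONE = SmallOrdinal(0, 1)
OMEGA = SmallOrdinal(1, 0)

one_plus_omega = add(ONE, OMEGA)   # 1 + ω
omega_plus_one = add(OMEGA, ONE)   # ω + 1
```

In this model 1 + ω comes out as ω itself, while ω + 1 is a different and greater ordinal, in line with the remark above.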

The cardinal associated with an ordinal ρ = |R|, where R is a well-order on A, is
the cardinal of the underlying set, |A|. Thus the cardinal associated with the ordinal
ω is ℵ₀. In the infinite case, there are many ordinals associated with the same
cardinal. In particular (as a consequence of the fact that ℵ₀ · ℵ₀ = ℵ₀ + ℵ₀ = ℵ₀) all
the ordinals mentioned in the preceding paragraph are associated with the same
cardinal ℵ₀. In fact, Cantor showed that given a cardinal α having ordinals
associated with it, the cardinal number of the set of such ordinals is greater than α itself.
It may be denoted α′, and then ℵ₀′ is denoted ℵ₁, and ℵ₁′ is denoted ℵ₂, and so on,
with sup{ℵₙ | n = 0, 1, 2, ...} being denoted ℵ_ω. There is, in fact, a whole series of
larger and larger cardinals ℵ_ρ, one for each ordinal ρ, and a number of theorems
established for ℵ₀ can be generalized to all these ℵ_ρ, one such generalization being
that ℵ_ρ · ℵ_ρ = ℵ_ρ + ℵ_ρ = ℵ_ρ.


3.5. The Axiom of Choice (AC) 


The arithmetic of cardinal numbers would be simplified if it could be assumed that
every cardinal $\alpha$ comes somewhere in the series of alephs, so that $\alpha = \aleph_\rho$ for some
ordinal $\rho$. In particular, this would imply that the partial order $\leq$ on cardinals is
in fact a (total) order, and that all infinite cardinals satisfy
$\alpha \cdot \alpha = \alpha + \alpha = \alpha$. The
assumption that every cardinal is an aleph amounts to the assumption that for every
set $A$ there exists some well-order $R$ of $A$. Cantor, however, found himself unable to
establish this Well-Ordering Principle (WO), or even to establish that there exists
some well-order of the set of real numbers.

The first proof of WO was published by Zermelo, who based his argument on the
Axiom of Choice (AC), according to which, for every equivalence, there exists a
selector. This axiom had been implicitly used by Cantor and others, but was made
explicit in Zermelo’s work. While the derivation of WO from AC is a substantial
result, the converse fact, that WO implies AC, is much easier to establish. For let $A$
be a set and $E$ an equivalence on it. WO supplies a well-order $R$ of $A$, from which
we easily obtain the selector $S = \{a \mid a$ is the $R$-least element of its $E$-equivalence
class$\}$. WO is only one of several principles known to be equivalent to AC. Another
is the assertion that for any two cardinals $\alpha$ and $\beta$, either $\alpha \leq \beta$ or $\beta \leq \alpha$. Yet
another is Zorn’s Lemma, widely used in mathematics, whose statement is too
technical to be given here.
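
The easy direction, from WO to AC, can be illustrated on a finite set, where of course no axiom is needed. This is a sketch only. The function name `selector`, the list `order` standing in for the well-order $R$, and the parity relation standing in for $E$ are all assumptions made for the example.

```python
from typing import Callable, Hashable, List


def selector(
    well_order: List[Hashable],
    equivalent: Callable[[Hashable, Hashable], bool],
) -> List[Hashable]:
    """Pick the R-least element of each E-equivalence class.

    `well_order` lists the elements of A in increasing R-order;
    `equivalent` decides the equivalence E.
    """
    chosen: List[Hashable] = []
    for a in well_order:
        # a is R-least in its class exactly when no earlier element is equivalent to it.
        if not any(equivalent(a, s) for s in chosen):
            chosen.append(a)
    return chosen


order = [7, 3, 10, 4, 1, 6]           # hypothetical well-order R of A
same_parity = lambda x, y: x % 2 == y % 2  # hypothetical equivalence E

print(selector(order, same_parity))   # [7, 10]
```

The point of the example is that the well-order supplies a uniform rule for choosing, which is precisely what is missing when AC is assumed outright.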

A distinctive feature of AC is that it asserts the existence of a certain set, a selector
$S$ for a given equivalence $E$ on a given set $A$, without giving any condition specifying
the set, such as would allow it to be written as $S = \{x \in A \mid \Phi(x)\}$. Largely owing to
this feature, AC remained controversial for at least a couple of decades after its
introduction, and, even today, mathematical textbooks sometimes flag those
theorems whose proofs depend on AC.

Among many striking consequences of AC perhaps the best known is the
Banach-Tarski theorem. This asserts that there exist two spheres $S$ and $T$ of unequal radius in
three-dimensional Euclidean space, and a decomposition of each into a small number
of disjoint pieces, $S = A_1 \cup \cdots \cup A_n$ and $T = B_1 \cup \cdots \cup B_n$ with $A_i \cap A_j$ and $B_i \cap B_j$
empty for distinct $i$ and $j$, such that corresponding pieces $A_i$ and $B_i$ are geometrically
congruent, meaning they can be made to coincide by rotating and translating one of
them.

For present purposes, a measure on subsets of n-dimensional Euclidean space may 
be defined as a function assigning such sets non-negative real numbers, and having 
the following three properties: 


1 Normality The measure of a simple set of the kind considered in elementary
geometry is simply its volume (if $n = 3$) or area (if $n = 2$) or length (if $n = 1$) as
defined in elementary geometry.

2 Invariance The measures of geometrically congruent sets are equal.

3 Countable additivity The measure of the union of countably many disjoint sets
is the sum of the measures of those sets.








The Banach-Tarski theorem shows that in dimension $n = 3$, there is no measure
defined on all sets, even if one weakens the requirement of countable additivity to
finite additivity. Actually, the axiom of choice implies that, in no dimension, is there
a measure defined on all sets. Thus some of the consequences of the axiom are
“negative” results.


3.6. The ‘Reduction’ of Mathematics to Set Theory 


The exposition of set theory to this point has made mention of several sorts of
entities over and above sets: ordered pairs, natural and rational and real and complex
numbers, transfinite cardinals and ordinals. If pure set theory, in which the only
entities considered are sets, is to serve as a framework for the development of
mathematics, set-theoretic surrogates must be found for these other sorts of entities,
and ordered pairs and numbers of various kinds “identified” with these set-theoretic
surrogates. The case of ordered pairs provides a paradigm of what goes on in such
identifications. The ordered pair $(a, b)$ is defined to be $\{\{a\}, \{a, b\}\}$, and from this
definition the basic law of ordered pairs, that $(a, b) = (c, d)$ iff $a = c$ and $b = d$, is
deduced. It is not pretended that this definition reveals what ordered pairs ‘really
were all along.’ What the definition and derivation of the basic law do show is that
the positing of ordered pairs subject to this law as entities over and above sets is, in
a sense, superfluous.
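
The basic law can be spot-checked by brute force. The sketch below models sets with Python’s built-in `frozenset`. The function name `pair` and the small test domain are choices made for illustration, and a finite check of this kind is of course no substitute for the deduction.

```python
from itertools import product


def pair(a, b) -> frozenset:
    """The ordered pair (a, b) defined as the set {{a}, {a, b}}."""
    return frozenset({frozenset({a}), frozenset({a, b})})


# Check the basic law (a, b) = (c, d) iff a = c and b = d on a small domain.
values = range(3)
for a, b, c, d in product(values, repeat=4):
    assert (pair(a, b) == pair(c, d)) == (a == c and b == d)

print(pair(1, 2) == pair(2, 1))  # False: order matters
print(len(pair(1, 1)))           # 1: {{1}, {1, 1}} collapses to {{1}}
```

The degenerate case $(a, a) = \{\{a\}\}$ is the one that makes the derivation of the basic law slightly less than immediate.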

As for Cantor’s transfinite numbers, on the identification generally adopted today,
each ordinal is identical with the set of its predecessors. Thus zero, which has no
predecessors, is the empty set, and: $0 = \{\}$, $1 = \{0\}$, $2 = \{0, 1\}$, ..., $\omega = \{0, 1, 2, \ldots\}$,
$\omega + 1 = \{0, 1, 2, \ldots, \omega\}$, and so on. Also, each cardinal is identified with the least
ordinal associated with it, so that for instance $\aleph_0 = \omega = N$, the set of natural
numbers. As for other kinds of numbers, in the late nineteenth century these were
already, in a process distinct from but related to the development of set theory,
being identified with certain (set-theoretic) constructions from natural numbers, and
set theory merely takes over these identifications.
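
For the finite ordinals the identification can be carried out literally. The following is a minimal sketch, again using `frozenset` as a stand-in for sets. The names `successor` and `von_neumann` are hypothetical labels introduced for the example.

```python
def successor(n: frozenset) -> frozenset:
    """The next ordinal after n: n together with n itself as a new element."""
    return n | frozenset({n})


def von_neumann(k: int) -> frozenset:
    """The finite ordinal k as the set of its predecessors {0, 1, ..., k-1}."""
    n = frozenset()
    for _ in range(k):
        n = successor(n)
    return n


three = von_neumann(3)
print(len(three))               # 3: the ordinal 3 has exactly three predecessors
print(von_neumann(1) in three)  # True: 1 is an element of 3
print(von_neumann(2) < three)   # True: 2 is also a proper subset of 3
```

On this identification the order on ordinals coincides with both membership and proper inclusion, as the last two lines illustrate.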

Consider, to begin with, the non-negative rationals $Q^+$. Intuitively, any pair of
natural numbers $(m, n)$ with $n \neq 0$ represents a rational number $m/n$, with two
such pairs $(m_1, n_1)$ and $(m_2, n_2)$ representing the same rational number if and
only if $m_1 \cdot n_2 = m_2 \cdot n_1$. Formally, one considers the set of pairs of naturals $(m, n)$
with $n \neq 0$, shows that the condition $m_1 \cdot n_2 = m_2 \cdot n_1$ defines an equivalence
relation on such pairs, and defines $m/n$ to be the equivalence class of $(m, n)$, and $Q^+$
to be the set of all such equivalence classes. One then defines operations on
non-negative rationals in terms of operations on natural numbers, so that
$m/n + p/q = (m \cdot q + p \cdot n)/(n \cdot q)$ and $(m/n) \cdot (p/q) = (m \cdot p)/(n \cdot q)$, and derives the basic laws of
arithmetic for non-negative rationals from those for natural numbers. Order on
non-negative rationals may also be defined in terms of order on natural numbers, so
that $m/n \leq p/q$ if and only if $m \cdot q \leq p \cdot n$.
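
The definitions above operate on representatives, so one must check that the result does not depend on which representative is chosen. The sketch below works directly with pairs of naturals rather than with equivalence classes. The function names are hypothetical, and the checks shown are spot checks on particular pairs, not a proof.

```python
from typing import Tuple

Pair = Tuple[int, int]  # (m, n) with n != 0 represents the rational m/n


def same_rational(x: Pair, y: Pair) -> bool:
    """The equivalence m1*n2 = m2*n1 on pairs."""
    (m1, n1), (m2, n2) = x, y
    return m1 * n2 == m2 * n1


def add(x: Pair, y: Pair) -> Pair:
    """m/n + p/q = (m*q + p*n)/(n*q), computed on representatives."""
    (m, n), (p, q) = x, y
    return (m * q + p * n, n * q)


def mul(x: Pair, y: Pair) -> Pair:
    """(m/n)*(p/q) = (m*p)/(n*q), computed on representatives."""
    (m, n), (p, q) = x, y
    return (m * p, n * q)


def leq(x: Pair, y: Pair) -> bool:
    """m/n <= p/q iff m*q <= p*n."""
    (m, n), (p, q) = x, y
    return m * q <= p * n


# (1, 2) and (2, 4) represent the same rational, as do (1, 3) and (3, 9).
print(add((1, 2), (1, 3)))                                   # (5, 6)
print(add((2, 4), (3, 9)))                                   # (30, 36)
print(same_rational(add((1, 2), (1, 3)), add((2, 4), (3, 9))))  # True
```

Different representatives give different pairs as output, but the outputs are equivalent, which is what it means for the operation to be well defined on equivalence classes.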

(Note that, in principle, if the symbols $+$ and $\cdot$ and $\leq$ are used for addition and
multiplication and order on natural numbers, they should not also be used to
denote addition and multiplication on the non-negative rationals; and if the symbols
0, 1, 2, ... are used for the natural numbers, they should not also be used to denote
the non-negative rational numbers 0/1, 1/1, 2/1, .... In practice, mathematicians
ignore such distinctions, and no real confusion results from this ‘abuse of language.’)

A particularly important construction, deriving from Dedekind’s modern adaptation
of Eudoxus’ ancient theory of proportion, is that of the non-negative reals $R^+$
from the non-negative rationals $Q^+$. Intuitively, a non-negative real number $\alpha$ is
determined by the set $A$ of non-negative rational numbers less than it.
(In case $\alpha = \sqrt{2}$, for instance, $A = \{r \in Q^+ \mid r^2 < 2\}$.) Such a set is non-empty, its
complement is non-empty, it has no largest element, and it is closed downwards in
the sense that $r \in A$ whenever $r \leq s$ and $s \in A$. If a set with these properties is called
a cut, then, intuitively, not only does every non-negative real number determine a
cut, but every cut determines a non-negative real number. Formally, one defines
non-negative real numbers simply to be these cuts. One can then define sums and
products by


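$$A + B = \{r + s \mid r \in A \text{ and } s \in B\}$$
and
$$A \cdot B = \{t \mid t < r \cdot s \text{ for some } r \in A \text{ and } s \in B\}$$
and order by
$$A \leq B \text{ iff } A \subseteq B$$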
and the basic laws of addition and multiplication and order for non-negative reals 
will follow from those for non-negative rationals. 


Negative numbers can then be introduced by another simple construction.
(Alternatively, they could have been introduced earlier, proceeding directly from the natural
numbers $N$ to the signed integers $Z$ and thence to the rationals $Q$ and reals $R$.) So
can the complex numbers $C$ and the numerous and varied other structures
considered in modern mathematics. In particular, the points of the $n$-dimensional Euclidean
space $E_n$ are simply $n$-tuples of real numbers $(x_1, \ldots, x_n)$.


3.7. Axiomatic Set Theory and the Iterative Conception of Set 


It cannot be assumed that every condition $\Phi(x)$ determines a set, and, in particular,
this cannot be assumed for the condition $x \notin x$, since if $A = \{x \mid x \notin x\}$ there is the
contradiction that $A \in A$ iff $A \notin A$. This contradiction, known as the Russell
paradox, is only the most easily stated of several related set-theoretic paradoxes that
became widely known in the years immediately after 1900. (Another is the
Burali-Forti paradox, which results if it is assumed that there is a set of all ordinals. The
order $\leq$ on ordinals would then be a well-order of this set, and its order type would
be the largest ordinal. But as has already been mentioned, there is no largest ordinal.)
Cantor himself had been aware of some of these paradoxes, and in his
correspondence had stated informal principles as to when a condition may be assumed to
determine a set, but he never published this material. In the wake of the
paradoxes, Zermelo published an axiom system for set theory, including rigorously
stated assumptions about when a condition $\Phi(x)$ does determine a set. Fraenkel and
others proposed amendments, and the axiom system as amended is known as ZF.
Adding AC to ZF produces ZFC, today the most widely accepted axiomatic system
for set theory.

The axioms of ZF are as follows. First, there is extensionality, already stated at
the beginning of this chapter, according to which sets with the same elements are
identical. Then there are the principles of separation, guaranteeing the existence
of $\{x \in u \mid \Phi(x)\}$ for any set $u$ and any condition $\Phi$, and of replacement,
guaranteeing the existence of $\{\psi(x) \mid x \in u\}$ for any set $u$ and any condition $\Psi$ such that
for each $x$ there exists a unique $y = \psi(x)$ for which $\Psi(x, y)$ holds. (In a fully
rigorous treatment, the vague notion of ‘condition’ would be replaced by a precise
notion of ‘formula,’ and separation and replacement would not be single axioms
but rather axiom schemes, or infinite lists of axioms, one for each relevant formula
$\Phi$ or $\Psi$.)

In addition, there are existence axioms of pairing, union, powers, and infinity,
asserting the existence of $\{a, b\}$ for any $a$ and $b$, of $\bigcup X$ for any $X$, of $\wp(U)$ for any
$U$, and of the set of natural numbers. These axioms are sufficient to establish the
existence of all the other sets introduced in the developments outlined so far,
though occasionally (as in the case of the product $A \times B$) the existence proofs are not
trivial. These axioms are also more than sufficient for the development of modern, 
abstract mathematics, except for the uses in certain areas of higher mathematics of 
the axiom of choice AC, already mentioned. 

There is an intuitive picture, often called the iterative conception of set, associated
with the axioms, according to which sets lie in a hierarchy of levels, where the
elements of a set at one level must come from levels below it. (Thus no set is an
element of itself, there are no two-element loops with $x_1 \in x_2$ and $x_2 \in x_1$, there are
no three-element loops with $x_1 \in x_2$ and $x_2 \in x_3$ and $x_3 \in x_1$, and so on; and,
moreover, there are no infinite descending chains with $x_1 \in x_0$, $x_2 \in x_1$, $x_3 \in x_2$, and
so on.) Since, in pure set theory, the only objects considered are sets, at the bottom
of the hierarchy, there is just the empty set $0 = \{\}$. If $x$ is a set, then $\{x\}$ always lies
at a level immediately above that of $x$. Thus $1 = \{0\}$ lies at the next-to-bottom level,
$2 = \{0, 1\}$ lies at the next-to-next-to-bottom level, and $N = \{0, 1, 2, \ldots\}$ lies at an
infinite level immediately above all finite levels, $\{N\}$ lies above that, and so on; and
there is no top level. The hierarchy is supposed to be as tall (to have as many levels)
and to be as wide (to have as many sets at each level) as possible, an intuition partly
expressed by the various existence axioms.

In more formal terms $V(\rho)$, the $\rho$th level of the cumulative hierarchy of sets, is
defined for all ordinals $\rho$ by these recursion equations:

$$V(0) = \{\} \qquad V(\rho') = V(\rho) \cup \wp(V(\rho)) \qquad \text{at limits } V(\tau) = \bigcup \{V(\sigma) \mid \sigma < \tau\}$$


The rank of a set $x$ is the least $\rho$ such that $x \in V(\rho + 1)$. The assumption of the iterative
conception that all sets belong to the cumulative hierarchy is equivalent to several
other assumptions, of which the most simply stated is the axiom of foundation or
regularity, according to which if a set $x$ has any elements at all, then it has an
element $y$ that has no elements in common with it. (The equivalence of the different
formulations of the assumption is non-trivial.) This is the last axiom of ZF.
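
The first few levels of the hierarchy are small enough to compute outright. The sketch below covers finite levels only and is meant purely as an illustration. The names `powerset`, `V`, and `rank` are chosen for the example, and `rank` will only terminate in practice for sets of very low rank, since the size of the levels grows as an iterated exponential.

```python
from itertools import combinations


def powerset(s: frozenset) -> frozenset:
    """The set of all subsets of s."""
    items = list(s)
    return frozenset(
        frozenset(c) for k in range(len(items) + 1) for c in combinations(items, k)
    )


def V(n: int) -> frozenset:
    """The finite level V(n): V(0) = {} and V(k+1) = V(k) united with its power set."""
    level: frozenset = frozenset()
    for _ in range(n):
        level = level | powerset(level)
    return level


def rank(x: frozenset) -> int:
    """The least n with x in V(n+1); feasible only for sets of rank at most 4."""
    n = 0
    while x not in V(n + 1):
        n += 1
    return n


print([len(V(n)) for n in range(5)])  # [0, 1, 2, 4, 16]

zero = frozenset()
one = frozenset({zero})
two = frozenset({zero, one})
print(rank(zero), rank(one), rank(two))  # 0 1 2
```

The next level, $V(5)$, already has 65,536 elements, and $V(6)$ has $2 \uparrow 65536$, which gives some sense of how quickly the hierarchy widens even before the first infinite level.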

A variant ZFU (or ZF mit Urelementen) allows a bottom level of ‘individuals’ (or
Urelemente), which have no elements but may be elements of sets of any level. A
different variant NBG (deriving from von Neumann, Bernays, and Gödel) allows a
top level of ‘classes,’ which may have sets of any level as elements but are not
themselves sets or elements of anything. Yet other variants drop foundation or
regularity and allow loops and chains, or depart from ZFC and its underlying
intuitive picture in more radical ways. Detailed consideration of such variants is
beyond the scope of this chapter.


3.8. Descriptive Set Theory


Considered as a subject in its own right, set theory has several branches. Cantor’s
earliest work was concerned specifically with sets of points in Euclidean spaces, and
there is a substantial branch of mathematics, called descriptive set theory, concerned
even more specifically with those sets of points in Euclidean spaces that can be
generated by comparatively simple processes.

The basic sets in $n$-dimensional Euclidean space are those of the form
$\{(x_1, \ldots, x_n) \mid a_1 < x_1 < b_1$ and ... and $a_n < x_n < b_n\}$ for some real numbers $a_i$ and $b_i$. (If $n = 1$,













John P. Burgess 


these are open intervals; if $n = 2$, open rectangles; and if $n = 3$, open rectangular
parallelepipeds.)


* An open set is one that is a union of countably many basic sets.
* A closed set is one that is the complement of an open set.
* An $F_\sigma$ set is one that is a union of countably many closed sets.
* A $G_\delta$ set is one that is the complement of an $F_\sigma$ set.


The famous ‘fractal’ sets depicted in so many computer drawings are sets of this level
of complexity.


* The projection of an $(n+1)$-dimensional set $A$ to dimension $n$ is the set
  $\{(x_1, \ldots, x_n) \mid$ for some $y$, $(x_1, \ldots, x_n, y) \in A\}$.
* An analytic set is one that is a projection of a $G_\delta$ set.
* A co-analytic set is one that is the complement of an analytic set.
* A PCA set is one that is a projection of a co-analytic set.



Repeatedly taking projections and complements in this way one obtains the
projective sets. Though this is, in many senses, a large family of sets, it can be shown
that the number of projective sets is only $c$, whereas the number of arbitrary subsets
of Euclidean space is $2 \uparrow c$. Descriptive set theory, or the theory of projective sets,
is perhaps the branch of set theory closest to other branches of mathematics (such
as real analysis and probability theory). To give at least one example of the kind of
question considered in this branch of set theory, recall that it is impossible to define
a notion of measure for all subsets of Euclidean space. Nonetheless, Henri Lebesgue
succeeded in defining a notion of measure applicable to a very large family of sets,
which descriptive set theorists have shown to include all analytic and co-analytic sets.
Whether all PCA sets, or even perhaps all projective sets, are Lebesgue measurable
are open questions of descriptive set theory. The hypothesis that PCA (respectively:
projective) sets are all Lebesgue measurable may be called PCAM (respectively: PM).





3.9. The Topology of the Real Line 


Another branch of set theory, closely connected with the branch of mathematics
known as general topology, is concerned with arbitrary (rather than just projective)
sets of real numbers, or what comes to the same thing, of points on the line. Here
there are some positive results going back to Cantor himself.

One such result (proved by a celebrated method called the back-and-forth
argument) states that if $R$ is an order on a set $A$, then it will be isomorphic to the usual
order on the rational numbers if and only if it has these three properties:


1 Absence of extrema There is no least and no greatest element in the $R$-order.
2 Density Whenever $aRc$ there exists a $b \in A$ with $aRb$ and $bRc$.
3 Countability The underlying set $A$ is denumerable.

Another such result states that if $R$ is an order on a set $A$, then it will be isomorphic
to the usual order on the real numbers if it has the following four properties:


1 Absence of extrema.

2 Density.

3 Completeness Whenever a non-empty subset $B \subseteq A$ has an upper bound, an
element $a \in A$ such that $bRa$ for all $b \in B$, then it has a least upper bound, an upper
bound $c$ such that for any other upper bound $d$ we have $cRd$.

4 Separability There is a countable subset $B \subseteq A$ such that whenever $a, c \in A$
and $aRc$ there is a $b \in B$ such that $aRb$ and $bRc$.


The Suslin hypothesis SH is the conjecture that in the last-stated theorem the last
assumption (4) can be weakened to the following:


4 Countable chain condition There is no uncountable set of disjoint, non-empty
open intervals.


The status of SH is an open question of the topology of the real line.

However, the most important open question about arbitrary sets of real numbers
or points of the line is whether such a set can have a cardinal number
intermediate between $\aleph_0$ and $c$. The conjecture, made by Cantor himself, that there are no
such intermediate cardinals, which amounts to the conjecture that $2 \uparrow \aleph_0 = \aleph_1$, is
known as the continuum hypothesis CH. Rival possibilities that have been considered
are the proposition CH* that there is just one intermediate cardinal, so that
$2 \uparrow \aleph_0 = \aleph_2$, and the proposition CH† that the number of intermediate cardinals is
$c$ itself. (A few alternatives can be ruled out. For instance, it can be proved that
$2 \uparrow \aleph_0 \neq \aleph_\omega$.)


3.10. Infinitary Combinatorics 


Just as, associated with the ordinary arithmetic of the natural numbers, there is a
sophisticated theory of finite combinatorics, concerned with counting permutations,
combinations, and the like, so also with transfinite arithmetic there is associated an
infinitary combinatorics. To give one example of a positive result in this branch of
set theory, there is Ramsey’s theorem, which in its simplest form says that if $I$ is a set
of cardinal $\aleph_0$ and the set $I^{[2]}$ of unordered pairs of distinct elements of $I$ is partitioned
into two disjoint pieces, $I^{[2]} = A \cup B$, then there is a subset $J \subseteq I$ also of cardinal $\aleph_0$
such that all the pairs from $J$ belong to the same piece, that is, either $J^{[2]} \subseteq A$ or
$J^{[2]} \subseteq B$.
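
The infinite theorem cannot be verified by computation, but it has a well-known finite analogue that can: however the pairs from a six-element set are split into two pieces, some three-element subset has all its pairs in the same piece, while for a five-element set this can fail. The following brute-force check is a sketch of that finite analogue only, not of the theorem stated above. The function names are hypothetical.

```python
from itertools import combinations, product


def has_uniform_triple(n: int, piece: dict) -> bool:
    """True if some 3-element subset of range(n) has all its pairs in one piece."""
    return any(
        piece[(a, b)] == piece[(a, c)] == piece[(b, c)]
        for a, b, c in combinations(range(n), 3)
    )


def every_partition_has_uniform_triple(n: int) -> bool:
    """Try every way of splitting the pairs from range(n) into pieces 0 and 1."""
    pairs = list(combinations(range(n), 2))
    for bits in product((0, 1), repeat=len(pairs)):
        piece = dict(zip(pairs, bits))
        if not has_uniform_triple(n, piece):
            return False
    return True


print(every_partition_has_uniform_triple(5))  # False
print(every_partition_has_uniform_triple(6))  # True
```

The search for six elements runs through $2 \uparrow 15$ partitions, which is still quick; the infinitary theorem replaces the three-element subset by one as large as the original set.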

There are many open questions in this branch of set theory, and again the most
important such question is the most basic one, the question of the status of the
generalized continuum hypothesis GCH, according to which $2 \uparrow \aleph_\rho = \aleph_{\rho+1}$ for all
ordinals $\rho$. (CH is thus the special case $\rho = 0$ of GCH.)


3.11. Independence Results: Constructible Sets and Forcing


According to Gödel’s first incompleteness theorem [see chapter 4], for any formal
theory or axiom system that is consistent and strong enough to develop a modicum
of mathematics, there will be statements that can be formulated in the language of
the theory that cannot be decided one way or the other within the system. So, in
particular, assuming ZFC is consistent, there must be a statement A in the language
of ZFC that is independent of ZFC in the sense that neither A nor its negation
~A can be proved in ZFC, or to put the matter another way, such that both the
result ZFC + ~A of adding ~A to ZFC and the result ZFC + A of adding A to
ZFC are consistent. Indeed, by Gödel’s second incompleteness theorem, the
statement Con(ZFC) expressing formally the assertion that ZFC is consistent is such an
independent statement A. Note that, by this last result, if ZFC is consistent, we
cannot hope to prove absolutely or unconditionally in ZFC itself that ZFC + A or
ZFC + ~A is consistent. The most one can hope to prove is relative consistency, the
conditional assertion that if ZFC is consistent, then so are ZFC + A and ZFC + ~A.
These general results of Gödel’s, however, leave entirely open whether there are any
specific set-theoretic questions actually raised in the mathematical literature that are
thus independent of ZFC.

In fact, there are many such statements. Gödel himself, in the 1930s, showed that
GCH (and hence CH) is consistent relative to ZFC (and at the same time that AC
is consistent relative to ZF). His method of inner models was to define a special class
of sets, the constructible sets, containing all ordinals and just enough other sets to
make it provable in ZF for each of the axioms of ZF that it comes out true when
‘set’ is interpreted to mean ‘constructible set.’ He also proved in ZF that the axiom
of constructibility (‘V = L’), according to which the constructible sets are in fact all the
sets there are, itself comes out true when ‘set’ is interpreted to mean ‘constructible
set,’ and in this way showed V = L to be consistent relative to ZF. He also proved
V = L implies (AC and) GCH (and hence CH). It is now known that V = L also
implies ~PCAM (a result foreseen by Gödel and worked out in detail by J. W.
Addison) and ~SH (a result of Ronald Jensen, who has derived innumerable other
unexpected consequences from V = L), which is sufficient to establish the relative
consistency of these statements as well.

Cohen, in the 1960s, showed that ~CH (and hence ~GCH) is consistent relative
to ZFC (and at the same time that ~AC is consistent relative to ZF), introducing for
this purpose his celebrated method of forcing (which he combined with the method
of inner models for the proof about ~AC). Soon elaborate refinements of that
method were developed by other set theorists, notably Solovay and Martin. Together
the two last-named workers applied these methods to prove the relative consistency
of a technical statement now known as Martin’s axiom (MA), or rather, of the
conjunction of MA and ~CH. This conjunction implies, among many other things, SH
(whose relative consistency had been proved by Solovay and Tenenbaum) and
PCAM (whose relative consistency also admits of other proofs). But the examples
mentioned are but a tiny fraction of the statements whose independence has been
established in the wake of the work of Gödel, Cohen, and their immediate successors.


3.12. Large Cardinals and Determinacy


The fact that the currently accepted axioms do not suffice to settle many basic
questions of set theory has motivated the consideration of new axioms. One large
class of candidates is constituted by the large cardinal axioms, each of which asserts
the existence of some cardinal $\aleph_\rho$ that, so to speak, towers over all smaller cardinals
$\aleph_\sigma$ for $\sigma < \rho$ in some of the ways in which $\aleph_0$ towers over the finite cardinals, with
different axioms of this class focusing on different such ways.

One way in which $\aleph_0$ exceeds all smaller cardinals is by having the following two
properties:


1 Regularity Any union of fewer than $\aleph_0$ sets each of cardinal less than $\aleph_0$ itself
has cardinal less than $\aleph_0$.
2 Strong limit If $\alpha$ is a cardinal less than $\aleph_0$, then $2 \uparrow \alpha$ is still less than $\aleph_0$.


A cardinal $\aleph_\rho$ with $\rho > 0$ having the analogous properties would be called inaccessible.
The hypothesis of the existence of such a cardinal may be called IC. The least such
cardinal would be far larger than the cardinal of any set considered in mainstream
mathematics.

As to the status of the hypothesis IC, the following may be said. In ZFC each of
the axioms of ZFC can be proved to come out true if ‘set’ is interpreted to mean
‘set in $V(\alpha)$’ or ‘set of rank $< \alpha$,’ where $\alpha$ is an inaccessible cardinal. It follows that
in ZFC + IC one can prove Con(ZFC), from which it follows by Gödel’s second
incompleteness theorem that if ZFC + IC is consistent, then in ZFC + IC one
cannot prove the conditional ‘if Con(ZFC) then Con(ZFC + IC).’ In other words, one
cannot hope to prove even the relative consistency of IC (though the relative
consistency of ~IC is comparatively easily proved). In jargon, ZFC + IC is said to
have greater consistency strength than ZFC.

Another way in which $\aleph_0$ exceeds all smaller cardinals is expressed in Ramsey’s
theorem as stated earlier. A cardinal $\aleph_\rho$ with $\rho > 0$ having the analogous property
would be called weakly compact. The hypothesis of the existence of such a cardinal
may be called WCC. The least such cardinal would be far larger than the least
inaccessible cardinal: indeed, if $\alpha$ is weakly compact, the number of inaccessible
cardinals less than $\alpha$ is itself $\alpha$; and the hypothesis WCC is of much greater
consistency strength than the hypothesis IC.

Many large cardinal axioms of still greater consistency strength have been
considered in the literature under various exotic names, such as the hypothesis of the
existence of supercompact cardinals, or SCC. In general, their definitions are too
technical to be given here. It has often been argued that large cardinal axioms may
be thought of as further expressions of the intuitive idea, included in the iterative
conception of set, that the cumulative hierarchy is as “tall” as possible; though
admittedly the argument becomes less compelling as the large cardinals grow larger.

A completely different class of axioms, involving the notion of infinite games, has
been considered by descriptive set theorists. Given a set $A$ of real numbers, say
between zero and one, a game for two players may be as follows.

Player I picks the first digit in the decimal expansion of a real number.
Player II then picks the second digit.

Player I then picks the third digit.

Player II then picks the fourth digit, and so on.


In this way (though infinitely many moves are required), a real number $a$ is generated.
Player I wins if $a \in A$, and else player II wins.


* A strategy for a player is a rule telling that player what digit to choose at each
  step, as a function of the opposing player’s choices at previous steps.

* A winning strategy for player I is one such that if player I follows it, then player
  I will win regardless of what player II does; and the notion of a winning strategy
  for player II is similarly defined. Obviously, there cannot exist winning strategies
  for both players.

* The set $A$ is called determinate if there exists a winning strategy for one or the
  other of the players.
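
For games that stop after finitely many moves, determinacy is not an axiom but a fact that can be checked by working backwards from the end of the game. The sketch below treats only such a finite truncation, which is an assumption made for illustration and is much weaker than the infinite games just described. The function name, the three-digit alphabet, and the winning set are all hypothetical.

```python
from typing import Callable, Sequence, Tuple


def player_one_can_force_win(
    prefix: Tuple[int, ...],
    length: int,
    digits: Sequence[int],
    in_A: Callable[[Tuple[int, ...]], bool],
) -> bool:
    """True if player I has a winning strategy from the position `prefix`.

    The game ends after `length` digits; player I wins if the finished
    sequence satisfies `in_A`. If the result is False, player II has a
    winning strategy instead, so every such finite game is determinate.
    """
    if len(prefix) == length:
        return in_A(prefix)
    outcomes = (
        player_one_can_force_win(prefix + (d,), length, digits, in_A) for d in digits
    )
    if len(prefix) % 2 == 0:
        # Player I moves first, third, ...: one good digit is enough.
        return any(outcomes)
    # Player II moves second, fourth, ...: player I needs every reply to be good.
    return all(outcomes)


# Hypothetical winning set: sequences whose digit sum is divisible by 3.
in_A = lambda seq: sum(seq) % 3 == 0

print(player_one_can_force_win((), 3, range(3), in_A))  # True: player I moves last
print(player_one_can_force_win((), 4, range(3), in_A))  # False: player II moves last
```

In the infinite game there is no last move from which to work backwards, which is why determinacy there is a substantive hypothesis rather than a routine calculation.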


The axiom of determinacy AD asserts that every set of real numbers is determinate.
AD itself can be disproved in ZFC, though the disproof does make use of AC. More
restricted axioms in the same direction have therefore been considered, notably the
axiom PD of projective determinacy, according to which all projective sets are
determinate. This axiom has many, many ‘positive’ consequences in descriptive set theory,
notably PM (and hence PCAM), and, indeed, these consequences are the main
motive for interest in the axiom, whose bare statement no one pretends to find
especially compelling.

During the 1970s and 1980s, PM was intensively investigated by a group of
logicians including Moschovakis, Kechris, Martin (already mentioned above), Steel,
and Woodin. Their work culminated in proofs by the three last-named of this group
that the assumption of sufficiently large large cardinals (inaccessible or weakly
compact cardinals are not nearly large enough, but cardinals in the neighborhood of
supercompact suffice) implies much determinacy, including PD; and conversely,
though such determinacy hypotheses do not literally imply the existence of large
large cardinals they at least imply their consistency. Thus a connection is established
between two classes of axioms having entirely different motivations, arguably
considerably increasing the strength of the motivation for each.

To date, however, no axiom such as would settle CH (let alone GCH) has
garnered nearly as much support as large cardinal or determinacy axioms, and it can
perhaps hardly be said of any of the three rival hypotheses CH, CH*, and CH† that it
is substantially more widely accepted than the others. In response to this situation,
various philosophical positions have been taken. Thus platonism (usually written
with a small ‘p’ to acknowledge the tenuousness of its connection with Plato)
maintains that there is a fact of the matter as to the size of $c$, whether or not set
theorists are able to discover that fact, and that if they wish to do so, set theorists
will simply have to work harder. But formalism maintains that the acceptance of one
rather than another of the various equally consistent rival hypotheses about $c$ would
be a matter of pure convention or stipulation. (Thus platonism and formalism are
roughly speaking the analogues in philosophy of mathematics of legal realism and
legal positivism in philosophy of law.) There are (as in the analogous case of
philosophy of law) both substantial reasons for dissatisfaction with each of the two extreme
positions, but also substantial difficulties involved in any attempt to articulate a clear
intermediate position.


Suggested further reading 


For an elementary introduction to set theory, especially in its role as a framework for the rest
of mathematics, see the perennially popular textbook Halmos (1960). For a more detailed
exposition, see Barwise (1977), a standard reference containing surveys on a generous scale,
mostly by distinguished contemporary set theorists, of the different branches of the subject,
with attribution of results to their original authors and references to the original technical
literature. While in mathematics, as in other sciences, historical sources tend to be superseded
quickly by later treatments, the original writings of the pioneers made available in English
translation (and with substantial introductory essays by present-day scholars) in van Heijenoort
(1967) remain of considerable interest. Philosophical problems about the nature of the
‘reduction’ of mathematics to set theory are raised in the classic paper Benacerraf (1968). The
search for new axioms for set theory is surveyed from a sympathetic philosophical perspective
in Maddy (1988), with special attention to the advances of the 1980s.


Chapter 4


Gödel’s Incompleteness Theorems


Raymond Smullyan 


4.1. Introduction 


At the turn of the century, there appeared two comprehensive mathematical systems,
which were indeed so vast that it was taken for granted that all mathematics could
be decided on the basis of them. However, in 1931, Kurt Gödel (1992 [1964])
surprised the entire mathematical world with his epoch-making paper which begins
with the following startling words:


The development of mathematics in the direction of greater precision has led to large
areas of it being formalized, so that proofs can be carried out according to a few
mechanical rules. The most comprehensive formal systems to date are, on the one
hand, the Principia Mathematica of Whitehead and Russell, and, on the other hand, the
Zermelo-Fraenkel system of axiomatic set theory. Both systems are so extensive that all
methods of proof used in mathematics today can be formalized in them – i.e. can be
reduced to a few axioms and rules of inference. It would seem reasonable, therefore, to
surmise that these axioms and rules of inference are sufficient to decide all
mathematical questions which can be formulated in the system concerned. In what follows it will
be shown that this is not the case, but rather that, in both of the cited systems, there
exist relatively simple problems of the theory of ordinary whole numbers which cannot
be decided on the basis of the axioms.


Gödel then went on to explain that the situation did not depend on the special 
nature of the two systems under consideration but held for an extensive class of 
mathematical systems. (More on this ‘extensive’ class is explained later on.) 

How did Gödel manage to find a sentence which, though true, is not provable in 
the system under consideration? Roughly speaking, what Gödel did was to assign to 
each sentence of the system a positive whole number, subsequently known as the 
Gödel number of the sentence, and then he very ingeniously constructed a sentence 
G that asserted that a certain number n was the Gödel number of a sentence that 
was not provable in the system, but the number n was the Gödel number of the very 
sentence G! Thus G asserted that its own Gödel number was the Gödel number of 
an unprovable sentence (unprovable in the system, that is), which is tantamount to 
saying that G asserts that G is not provable in the system. Thus if G is true, then, as 
it asserts, it is not provable in the system; whereas if G is false, then what it asserts is 
not the case, which would mean that it is provable in the system. Thus, one of two 
alternatives holds: 


1. G is true but not provable in the system. 
2. G is false but provable in the system. 


Alternative (2) was really out of the question, since the system was obviously set up 
so that only true sentences were provable; hence it must be alternative (1) that 
holds, and hence G is true, but not provable in the system. 

I once gave the following somewhat humorous illustration of this. Consider the 
following sentence: 


This sentence can never be proved. 


Is there a paradox here? Well, it seems so, because if the sentence is false (if it is 
false that it can never be proved) then it can be proved, which means it must be 
true. So if it is false, it must also be true, which is impossible. Therefore, it cannot be 
false; it must be true. Now, I have just proved that the sentence is true. Since it is 
true, then what it says is really the case, which means that it can never be proved. So, 
how come I have just proved it? 

What is the fallacy in the above reasoning? The fallacy is that the notion of 
provable is not well defined. One important purpose of the field known as 
“Mathematical Logic” is to make the notion of proof a precise one. However, there has not 
been given a fully rigorous notion of proof in any absolute sense; one speaks rather 
of provability within a given system. Now, suppose there is a system, call it system 
S, in which the notion of provability within the system S is clearly defined. Suppose 
also that the system S is correct in the sense that everything provable in the system 
is really true. Now consider the following sentence: 


This sentence is not provable in system S. 


The paradox disappears! Instead of a paradox, there is now an interesting truth, 
namely, that the above sentence must be a true sentence which is not provable in 
system S, because if it were false, then unlike what it says, it would be provable in 
system S, contrary to the given fact that S is a correct system which never proves any 
false sentences. Thus, the sentence is true and hence also not provable in system S. 
This sentence is a crude formulation of Gödel’s famous sentence G discussed earlier. 
Some more refined formulations are discussed later on. 

Now consider another heuristic illustration: In a certain land, every inhabitant is 
classified as either Type T or Type F. All statements made by those of Type T are 
true, whereas all statements made by those of Type F are false (hence the letters ‘T’ 
and ‘F’). One day, a logician visited this land. Now, this logician was completely 
accurate in all his proofs; he never proved anything that was not true. He came 
across an inhabitant of this land called “Jal.” Well, Jal made a certain statement, from 
which it logically follows that Jal must be of Type T, but the logician could never 
possibly prove that he is of Type T! What statement could accomplish this? (Can the 
reader solve this before reading further?) 

One solution is that Jal said: “You can never prove that I am of Type T.” Could 
Jal be of Type F? Well, if he were, then his statement would be false, which would 
mean that the logician could prove that Jal is of Type T, contrary to the given 
condition that the logician never proves anything false. Thus Jal cannot be of Type 
F; he must be of Type T. Since he is of Type T, his statement was true, which means 
that the logician can never prove that he is of Type T. Thus Jal really is of Type T, 
but the logician can never prove that he is. 
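
The case analysis in this solution can also be checked mechanically. The following is a minimal sketch, not part of the original puzzle: it assumes a "world" is fully described by two yes/no facts (whether Jal is of Type T, and whether the logician can prove that he is), and the function name `consistent_worlds` is a hypothetical one chosen for illustration. It keeps only the worlds compatible with the rules of the land and with the logician's accuracy.

```python
from itertools import product


def consistent_worlds():
    """Return every (jal_is_T, logician_proves_T) pair allowed by the puzzle."""
    worlds = []
    for jal_is_T, logician_proves_T in product([True, False], repeat=2):
        # Jal's statement: "You can never prove that I am of Type T."
        statement_is_true = not logician_proves_T
        # Type T inhabitants make only true statements, Type F only false ones.
        if jal_is_T != statement_is_true:
            continue
        # The logician is accurate: anything he proves is true.
        if logician_proves_T and not jal_is_T:
            continue
        worlds.append((jal_is_T, logician_proves_T))
    return worlds


print(consistent_worlds())
```

Under these assumptions exactly one world survives: Jal is of Type T and the logician cannot prove it, which is the conclusion reached above.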

Now, to carry the matter a bit further, suppose that, in addition, the logician 
knows logic as well as you and I. Well, having just proved that Jal must be of Type 
T, what is to prevent the logician from going through the same reasoning and hence 
coming up with the conclusion that Jal is of Type T? He would thus prove that Jal 
is of Type T, which would falsify Jal’s statement, thus making Jal in reality to be of 
Type F! Thus the logician would prove that Jal is of Type T, whereas Jal would 
really be of Type F, contrary to the given condition that the logician never proved 
anything false. Is this not therefore a paradox? No, not really. (Can the reader see 
why without reading further?) 

The only way for there not to be a paradox is if the logician does not know 
everything that you and I know. We are told that the logician knows logic as well as 
you and I, but could there be something else that we know but which he does not? 
Yes, there is! I told you that the logician was always accurate in his proofs, but I 
never told you that he knew or could prove that he was accurate! Indeed, if we 
retract the assumption that he is always accurate, then if he could prove that he is 
always accurate, he would lapse into an inaccuracy as follows: He would reason: 


Suppose Jal is of Type F; then his statement is false, which means that I could prove he 
is of Type T, and hence I would be inaccurate. Now, I am never inaccurate (sic); 
hence he cannot be of Type F; he must be of Type T. 


Thus, the logician has proved that Jal is of Type T, thus falsifying Jal’s statement, 
and so, Jal must really be of Type F. Thus, by assuming his own accuracy, the 
logician has lapsed into an inaccuracy (and was punished for his own conceit!). 

This is closely related to the result known as Gödel’s Second Incompleteness 
Theorem, which is that for any of the mathematical systems under consideration, if 
it could prove its own consistency, it would be inconsistent, and hence if the system 
is consistent (which it certainly seemed to be), then it could never prove its own 
consistency. I will say more about Gödel’s second theorem later on, but for now, 
here is an inkling as to how Gödel managed to construct a sentence that was 
“self-referential” in that it asserted its own non-provability. 


4.1.1. A Gödelian machine 


Here is a computing machine that illustrates Gödelian self-reference in a very 
instructive way. It prints out various sentences built from five symbols: 

N   P   D   (   ) 

Informally, the symbol N stands for not; the symbol P stands for printability 
(by the machine); and the parentheses are used for naming expressions. For any 
combination X of the five symbols, by its name is meant the expression (X), i.e., 
X enclosed in parentheses. For example, the name of PNDP is (PNDP). By the 
diagonalization of an expression X is meant X(X), i.e., the expression followed by 
its own name (X). For example, the diagonalization of PNDP is PNDP(PNDP). 
The symbol D stands for diagonalization. 

By a sentence is meant any combination of the five symbols that is of one of four 
forms (where X is any combination of those symbols whatsoever): 


P(X)      for example, P(NDNP) 
NP(X)     for example, NP(DDN) 
PD(X)     for example, PD(NDP) 
NPD(X)    for example, NPD(PNN) 


I will explain what the sentences mean in a moment. An expression is called 
printable if the machine can print it. The machine is so programmed that anything it can 
print, it will print sooner or later. 

Now here is what the sentences mean: 


P(X) means that X is printable and is, accordingly, called true iff (if and only if) 
X is printable. (For example, P(NDNP) is true if NDNP is printable and false if 
NDNP is not printable.) 


NP(X) is called true iff it is not the case that X is printable, in other words, iff 
X is not printable. 


PD(X) means that the diagonalization of X is printable. Thus, PD(X) is true iff 
X(X) is printable. 


NPD(X) means the opposite of PD(X), in other words, that X(X) is not printable. 


This constitutes a perfectly precise definition of what it means for a sentence to be 
true. We have here an interesting loop: The machine is self-referential in that it 
prints out various sentences that assert what the machine can and cannot print. 
(Such machines are of interest to the field known as artificial intelligence.) Now, the 
machine is totally accurate, in that every sentence it prints is true; it never prints 
anything false! This has several ramifications: For any expression X, if the machine 
can print P(X), then P(X) must be true, hence X will be printed sooner or later. If 
PD(X) is printable, then X(X) will be printed sooner or later. If NP(X) is printable, 
then X itself will never be printed. 
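
The truth definition just given is mechanical enough to write down as a short program. The following is a minimal sketch under stated assumptions: the machine itself is not modelled, only a hypothetical finite set `printable` standing in for "everything the machine will ever print", and the function names (`parse`, `is_true`, `is_accurate`) are illustrative choices, not anything from the text.

```python
import re

# A sentence is one of P(X), NP(X), PD(X), NPD(X), where X is any string
# over the five symbols N, P, D, ( and ).
_SENTENCE = re.compile(r"(NPD|NP|PD|P)\(([NPD()]*)\)")


def diagonalize(x: str) -> str:
    """Return X(X): the expression followed by its own name."""
    return f"{x}({x})"


def parse(expression: str):
    """Return (prefix, X) if the expression is a sentence, otherwise None."""
    match = _SENTENCE.fullmatch(expression)
    return (match.group(1), match.group(2)) if match else None


def is_true(sentence: str, printable: set) -> bool:
    """Evaluate a sentence against a given set of printable expressions."""
    parsed = parse(sentence)
    if parsed is None:
        raise ValueError(f"not a sentence: {sentence!r}")
    prefix, x = parsed
    # A trailing D means the sentence talks about X(X) rather than X.
    subject = diagonalize(x) if prefix.endswith("D") else x
    holds = subject in printable
    # A leading N negates the printability claim.
    return not holds if prefix.startswith("N") else holds


def is_accurate(printable: set) -> bool:
    """True if every printed expression that is a sentence is a true one."""
    return all(is_true(e, printable) for e in printable if parse(e) is not None)
```

For instance, with `printable = {"NDNP", "P(NDNP)"}` the sentence `P(NDNP)` evaluates as true and the set counts as accurate, whereas a set containing both `NP(PD)` and `PD` does not, since it prints a false sentence.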

Now, suppose X is printable; does it necessarily follow that P(X) is printable? 
Well, X being printable makes P(X) true; however, we are not given that all true 
sentences are printable, but only that no false sentences are printable, and so we 
have no reason to believe that P(X) is printable on the mere basis that P(X) is true. 
As a matter of fact, there is a true sentence that is definitely not printable, and the 
reader is invited to try to construct such a sentence before reading the solution. (Hint: 
Construct a sentence that asserts its own non-printability, just like the inhabitant of 
Type T who said that the logician could never prove him to be of Type T.) 

Here is the solution: A sentence that works is NPD(NPD). Why? Well, for any 
expression X, the sentence NPD(X) asserts that the diagonalization X(X) of X is 
not printable. In particular, taking NPD for X, the sentence NPD(NPD) asserts that 
the diagonalization of NPD is not printable. But the diagonalization of NPD is the 
very sentence NPD(NPD)! Thus, NPD(NPD) asserts its own non-printability. If 
the sentence were false, then what it asserts would not be the case, which would 
mean that the sentence is printable, contrary to the given condition that the machine 
is accurate and never prints any false sentences! Therefore, the sentence cannot be 
false; it must be true, and hence must be non-printable, as it asserts. Thus, NPD(NPD) 
is true but the machine cannot print it. 
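
The self-reference in this solution is a plain fact about strings, and it can be checked directly. The sketch below is illustrative only: `diagonalize` and `g_is_true` are hypothetical helper names, and each candidate `printable` set is an assumed stand-in for what some machine might print.

```python
def diagonalize(x: str) -> str:
    """Return X(X): the expression followed by its own name."""
    return f"{x}({x})"


G = "NPD(NPD)"

# G has the form NPD(X) with X = "NPD", so it asserts that the
# diagonalization of "NPD" is not printable.
subject = diagonalize("NPD")

# That diagonalization is G itself: G talks about G.
assert subject == G


def g_is_true(printable: set) -> bool:
    """Truth value of G relative to a hypothetical set of printable strings."""
    return subject not in printable


# Whatever the machine prints, G is true exactly when G is not printed.
for printable in (set(), {"NPD"}, {G}, {G, "P(NPD)"}):
    print(sorted(printable), "prints G:", G in printable,
          "G true:", g_is_true(printable))
```

A machine that prints G therefore prints a false sentence, so an accurate machine must leave G unprinted, and G is then true.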


Remarks  The above is a Gödel-type argument reduced to about a bare minimum. 
Obviously, no accurate machine can possibly print a sentence that says that the 
machine cannot print it! (The situation is reminiscent of the scene in Romeo and 
Juliet in which the nurse comes running to Juliet and says, “Do you not see that 
I am out of breath?” Juliet replies, “How art thou out of breath, when thou hast 
breath to say to me that thou art out of breath?”) 

An interesting thing about the above machine language is that if S is any of the 
mathematical systems subject to Gödel’s argument, it is possible to translate any 
sentence X of the machine language to a sentence such that X is true iff its 
translation is a true sentence of the system S, and also X is printable by the machine iff its 
translation is provable in S. It then follows that the translation of the sentence 
NPD(NPD) must be a true sentence of S which is not provable in S. 





4.2. Incompleteness in a General Setting 


Now to the heart of the matter: What are the features of those mathematical systems 
subject to Gödel’s (and also later) incompleteness arguments? Well, the systems in 
question have the following features: First, there is a well-defined infinite set of 
expressions, some of which are called sentences, some of which are called predicates. 
Informally, sentences express propositions, which are either true or false, whereas 
predicates act as names of properties or sets of (whole) numbers. (There may be 
expressions that have other functions as well, but these are not of concern here.) To 
every expression X and every number n is associated an expression denoted by X[n], 
and for every predicate H, the expression H[n] is a sentence. Informally, H[n] 
expresses the proposition that n belongs to the set named by H. Let every 
expression be assigned a number called its Gödel number, and for any expressions X and Y, 
define X(Y) to mean X[y] where y is the Gödel number of Y. (In this manner, the 
necessity of constantly referring to Gödel numbers is circumvented and thus the 
expressions themselves can be dealt with more directly.) In particular, for any 
predicate H and any expression X, the expression H(X) is a sentence that informally 
asserts that the set named by H contains the Gödel number of X. 

To each pair of numbers x and y is associated a number denoted by x * y such that 
if x is the Gödel number of an expression X and y is the Gödel number of an 
expression Y, then x * y is the Gödel number of the expression X(Y). 

Now, to the important notion of diagonalizers. For each predicate H is associated 
a predicate H# called the diagonalizer of H. Informally, for any number n, the 
sentence H#[n] expresses the proposition that n * n lies in the set named by H. Thus 
H#[n] expresses the same proposition as the sentence H[n * n]. Also, therefore, for 
any expression X, the sentence H#(X) expresses the same proposition as H(X(X)). 
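
To make the roles of X[n], X(Y) and the operation on Gödel numbers concrete, here is a toy model. Everything in it is an assumption made for illustration and none of it is the chapter's own construction: expressions are strings over a small invented alphabet, X[n] is formed by appending the decimal numeral for n in square brackets, and Gödel numbers come from reading a string as a numeral with nonzero digits. The operation written x * y in the text is called `star` here, and it is only defined on numbers that actually code expressions.

```python
ALPHABET = "HP~#[]0123456789"   # hypothetical toy alphabet
BASE = len(ALPHABET) + 1        # digits 1..16 are used, 0 is never a digit


def godel_number(expression: str) -> int:
    """Code a string as a positive integer (distinct strings, distinct codes)."""
    n = 0
    for ch in expression:
        n = n * BASE + ALPHABET.index(ch) + 1
    return n


def decode(n: int) -> str:
    """Recover the expression from its Gödel number."""
    chars = []
    while n > 0:
        n, digit = divmod(n, BASE)
        if digit == 0:
            raise ValueError("not the Gödel number of any expression")
        chars.append(ALPHABET[digit - 1])
    return "".join(reversed(chars))


def apply_to_number(x_expr: str, n: int) -> str:
    """The expression X[n]."""
    return f"{x_expr}[{n}]"


def apply_to_expression(x_expr: str, y_expr: str) -> str:
    """The expression X(Y), defined as X[y] with y the Gödel number of Y."""
    return apply_to_number(x_expr, godel_number(y_expr))


def star(x: int, y: int) -> int:
    """The number x * y: the Gödel number of X(Y), computed from x and y alone."""
    return godel_number(apply_to_number(decode(x), y))


h = godel_number("H")
# The diagonalization H(H) has Gödel number h * h.
assert godel_number(apply_to_expression("H", "H")) == star(h, h)
```

In this toy setting the diagonalizer condition reads: H#[n] should hold exactly when `star(n, n)` lies in the set named by H, so that H#(X) says of X what H says of X(X).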

To each expression X is associated an expression X̄ (more often written ~X, as it is here) 
called the negation of X, such that if X is a sentence, so is ~X (which informally 
asserts that X is not true) and also, if H is a predicate, so is ~H, and for any number 
n, the sentence (~H)[n] is the negation of H[n] (thus (~H)[n] = ~(H[n])) and so also for 
any expression X, the sentence (~H)(X) is the negation of H(X). Informally, H[n] 
asserts that n belongs to the set named by H, hence (~H)[n] asserts that n does not 
belong to the set named by H. 

In each of the systems under consideration, there is a well-defined notion of 
provability (within the system). A sentence S is called refutable (in the system) if its 
negation ~S is provable in the system. The systems under consideration use what is 
known as classical logic, one aspect of which is that for any sentence S, its double 
negation ~~S is provable iff S itself is provable (reminiscent of the adage, ‘Two 
negatives make a positive’). Therefore, ~S is refutable iff S is provable. Thus there is the 
symmetry: S is refutable iff ~S is provable, and also, S is provable iff ~S is refutable. 

The system is called consistent if no sentence is both provable and refutable; 
otherwise, it is called inconsistent. The underlying logic of the system is such that if 
so much as one sentence is both provable and refutable, then every sentence becomes 
provable, and thus the whole system collapses! 

The system is called complete if every sentence is either provable or refutable; 
otherwise, incomplete. As indicated earlier, before Gödel’s discovery, it was erroneously taken 
for granted that the two main mathematical systems of the time were complete. 
Gödel showed (under a certain very reasonable assumption, which is explained later) 
that these systems, as well as an important variety of others, were incomplete. 

Now, turning to some general incompleteness arguments: The first 
incompleteness argument to be considered is a slightly later variant of Gödel’s original 
argument, but it is given here first, since it is the simplest. It employs the notion of truth, 
which was not formalized until 1936 by Tarski (1956). 

Each sentence of the system is classified as either true or false, and in the systems 
under consideration, the following three conditions hold: 








T1: For each sentence S, either S is true and ~S is false, or S is false and ~S is true. 


T2: For each predicate H and every number n, the sentence H#[n] is true iff 
H[n * n] is true (and thus for every expression X, the sentence H#(X) is 
true iff H(X(X)) is true). 


T3: There is a predicate P such that for every expression X, the sentence P(X) 
is true iff X is a provable sentence of the system. 


Under the assumption that the system is correct, in that no false sentences are 
provable, it logically follows from conditions T1, T2 and T3 that there must be a 
true sentence which is not provable in the system, and hence also that the system 
must be incomplete. Here is why: 

Let Q be the predicate (~P)# (the diagonalizer of the negation of the predicate P). 
Then, for any expression X, the sentence Q(X) must be true iff X(X) is not a 
provable sentence (because by T2, (~P)#(X) is true iff (~P)(X(X)) is true, which by T1 is, 
in turn, the case iff P(X(X)) is not true (since (~P)(X(X)) is the negation of P(X(X))), 
which by T3 is, in turn, the case iff X(X) is not provable in the system). In particular, 
taking Q for X, the sentence Q(Q) is true iff Q(Q) is not provable. Thus, either 
Q(Q) is true but not provable in the system, or Q(Q) is false but provable. The 
latter alternative is ruled out by the assumption that the system is correct. Thus, 
Q(Q) is true but not provable in the system. It further follows that the negation of 
Q(Q) is false, hence also not provable (by the assumption of correctness); thus Q(Q) 
is neither provable nor refutable in the system, and so the system is incomplete. 
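
The chain of equivalences in this argument can be set out line by line. The display below only restates the reasoning just given; it writes $\sim$ for negation and $^{\#}$ for the diagonalizer, and the layout (not the content) is an addition.

```latex
\begin{align*}
Q(X)\ \text{is true}
  &\iff (\sim P)^{\#}(X)\ \text{is true}
      && \text{(definition of } Q\text{)} \\
  &\iff (\sim P)(X(X))\ \text{is true}
      && \text{(by } T_2\text{)} \\
  &\iff P(X(X))\ \text{is not true}
      && \text{(by } T_1\text{)} \\
  &\iff X(X)\ \text{is not provable}
      && \text{(by } T_3\text{)}
\end{align*}
% Taking X = Q:
\[
Q(Q)\ \text{is true} \iff Q(Q)\ \text{is not provable}.
\]
```

Correctness then excludes the case "false but provable", leaving Q(Q) true and unprovable.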


Note  In each of the systems studied by Gödel, there is also a predicate R such that 
for any expression X, the sentence R(X) is true iff X is a refutable sentence of the 
system. As pointed out in Smullyan (1961), this leads to a variant of the above 
argument which is a bit simpler: Let K be the predicate R#. Then for any expression 
X, the sentence K(X) is true iff X(X) is a refutable sentence of the system (because 
R#(X) is true iff R(X(X)) is true, which in turn is the case iff X(X) is a refutable 
sentence). In particular, the sentence K(K) is true iff it is refutable. Thus, it is either 
true and refutable, or false but not refutable. If it were true and refutable, then its 
negation would be false and provable, contrary to the assumption that the system is 
correct. Hence, it must be that the sentence is false but not refutable, and therefore 
its negation is true but not provable, and so again there is a sentence that is true but 
not provable in the system. 

Informally, the sentence Q(Q) described earlier can be thought of as saying, ‘I am 
not provable,’ whereas the sentence K(K) says, ‘I am refutable.’ 

Going back to the conditions T1, T2 and T3, notice that, in proving that these 
conditions do hold for the systems studied by Gödel, the proofs of the first two are 
relatively simple, but the proof of T3 is extremely elaborate! 





4.2.1. Tarski’s Theorem 


Assuming conditions T1 and T2, can there exist a predicate H such that for every 
sentence X, the sentence H(X) is true iff X is true? Such a predicate, if there is one, 
would be called a truth predicate. Well, Tarski’s celebrated result (1956 [1936]) is 
that there cannot be such a predicate (assuming conditions T1 and T2), because if 
there were such a predicate H, then for any predicate K, the sentence (~H)#(K) would 
be true iff (~H)(K(K)) were true, which, in turn, would be so iff H(K(K)) were false, 
which, in turn, would be the case iff K(K) were false (since H is a truth predicate). 
In particular, taking (~H)# for K gives the absurdity that (~H)#((~H)#) is true iff 
(~H)#((~H)#) is false! Therefore, assuming T1 and T2, there cannot be a truth predicate. 
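
Set out as a display, the argument runs as follows. This is a restatement of the reasoning above rather than a new proof; it writes $\sim$ for negation and $^{\#}$ for the diagonalizer, and $H$ is the supposed truth predicate.

```latex
\begin{align*}
(\sim H)^{\#}(K)\ \text{is true}
  &\iff (\sim H)(K(K))\ \text{is true}
      && \text{(by } T_2\text{)} \\
  &\iff H(K(K))\ \text{is false}
      && \text{(by } T_1\text{)} \\
  &\iff K(K)\ \text{is false}
      && \text{(} H \text{ is a truth predicate)}
\end{align*}
% Taking K = (\sim H)^{\#}:
\[
(\sim H)^{\#}\bigl((\sim H)^{\#}\bigr)\ \text{is true}
  \iff (\sim H)^{\#}\bigl((\sim H)^{\#}\bigr)\ \text{is false},
\]
```

which is impossible by T1, so no such H exists.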

Notice that, from Tarski’s theorem and condition T3, it is immediate that the 
system must be incomplete (assuming correctness), because if truth and provability 
coincided, then by T3, the predicate P would be a truth predicate, which it is not, 
by Tarski’s theorem. Since they do not coincide, then either some true sentence 
is not provable or, contrary to correctness, some provable sentence is not true, 
and so it must be that some true sentence is not provable. (However, this proof 
does not establish the more specific fact that (~P)#((~P)#) is a true sentence that is not 
provable.) 


4.2.2. Gödel’s original proof 


As noted, the above proof came later than Gödel’s original proof, since Tarski’s 
definition of truth came five years later (1936). Now, to the method actually used 
by Gödel: 

Each formal proof of the system consists of a finite sequence of sentences 
constructed according to purely mechanical rules. Gödel assigned numbers not only to 
expressions, but also to proofs. And now, for any positive integer n and any sentence 
X, X is said to be provable at stage n iff there is a proof of X whose Gödel number 
is n. As before, there is a predicate P such that for any sentence X, the sentence 
P(X) informally expresses the proposition that X is provable. But also, for each positive 
integer n, there is a predicate P_n such that for any expression X, the sentence P_n(X) 
informally expresses the proposition that X is a sentence provable at stage n. In the 
systems subject to Gödel’s argument, these three conditions hold: 


G1: For any positive integer n and any sentence X, if X is provable at stage n, 
then P_n(X) is provable, and if X is not provable at stage n, then P_n(X) is 
refutable. 


G2: If for at least one n the sentence P_n(X) is provable, then P(X) is provable. 


G3: For any predicate H and any expression X, if either of the two sentences 
H#(X) and H(X(X)) is provable, so is the other, and if either one is 
refutable, so is the other. 


Informally, the idea behind G1 is that, at any stage, the system has memory, so to 
speak, of what has and what has not already been proved. And so, if X is proved at 
stage n, then (perhaps at a later stage) P_n(X) will be proved, and if X is not proved 
at stage n, then P_n(X) is false and will sooner or later be refuted. As for G2, it is 
obvious that if X is provable at stage n, then it is certainly provable at some stage or 
other, and so if P_n(X) is true, so is P(X). The point now is that the system is strong 
enough so that if P_n(X) is provable, then P(X) is not only true, but actually provable 
in the system. 
As already indicated, under the intended interpretation, P(X) asserts that X is 
provable (at some stage or other) whereas P_n(X) asserts that X is provable at stage 
n. Now, suppose that there is a sentence X such that P(X) is provable, yet each of 
the infinitely many sentences P_1(X), P_2(X), ..., P_n(X), ... is refutable. Now if 
P(X) is true, then X must be provable at some stage or other, hence it cannot be 
that all the sentences P_1(X), P_2(X), ..., P_n(X), ... are false. In other words, if X is 
provable at some stage or other, then it cannot be that for every number n, the 
sentence X is not provable at stage n! Therefore, if P(X) is provable, but also for 
each n, P_n(X) is refutable, then the system is certainly not correct (with respect to 
the intended interpretation), but this does not mean that the system is necessarily 
inconsistent (which means that some sentence and its negation are both provable). 
Well, if there is some sentence X such that P(X) is provable and also all the 
sentences P_1(X), P_2(X), ..., P_n(X), ... are refutable, then the system is called 
unstable; otherwise the system is called stable. As noted, instability does not necessarily 
imply inconsistency (it is known that there are unstable systems which are nevertheless 
consistent), but stability certainly implies consistency, because if a system of the 
type considered is inconsistent, all sentences are provable, which, of course, implies 
instability. Also, if the system is correct, it must also be stable. Thus correctness 
implies stability which, in turn, implies consistency, and so stability is a property 
midway in strength between correctness and consistency. 

Notice that stability is a special case of the condition known as ω-consistency 
(omega-consistency): A system is called ω-inconsistent if there is some property of 
(whole) numbers such that, on the one hand, it can be proved that there exists some 
number or other that has the property, but, on the other hand, for any particular 
number n, it is provable that n does not have the property. 

Now, Gödel’s incompleteness proof did not require the assumption that the 
system was correct (which was required in the last proof) but only that the system 
was ω-consistent, or at least, stable. In fact, Gödel constructed a sentence G from 
which it can be shown that if the system is consistent, then G is not provable, and if 
the system is stable, then G is also not refutable. Well, as shall be seen, such a 
sentence can be constructed from just the conditions G1, G2 and G3. In fact, it turns 
out that (~P#)(~P#), where ~P# is the negation of the diagonalizer P#, is such a sentence. 

To begin with, note that the systems under discussion are such that, for any 
predicate H and any expression X, the sentence (~H)(X) is refutable iff H(X) is 
provable (because (~H)(X) is the negation of H(X)). In particular, taking P# for H, 
the sentence (~P#)(X) is refutable iff P#(X) is provable, which, in turn, is the case 
iff P(X(X)) is provable (by G3). Thus, for any expression X, the sentence (~P#)(X) is 
refutable iff P(X(X)) is provable. In particular, taking ~P# for X, (~P#)(~P#) is refutable 
iff P((~P#)(~P#)) is provable. And so taking G to be the sentence (~P#)(~P#) gives the 
following key fact: 











G is refutable iff P(G) is provable. 


Now, suppose G is provable. Then it is provable at some stage n, hence P_n(G) is 
provable (by condition G1) and therefore P(G) is provable (by condition G2). And 
so, if G is provable, so is P(G). But by the above key fact, if P(G) is provable, 
then G is refutable, and so if G were provable, it would also be refutable, which means 
the system would be inconsistent! Therefore, if the system is consistent, then G is not provable. 

Now, suppose that the system is stable. Then it is also consistent, and hence G is 
not provable, as seen above, and so G is not provable at any stage n. Therefore, by 
condition G1, for every n, the sentence P_n(G) is refutable. Hence, assuming stability, 
P(G) is not provable, and so again by the above key fact, G is not refutable. This 
proves the following result: 


Proposition G (A Generalization of Gödel’s Theorem)  Any stable system satisfying 
conditions G1, G2 and G3 must be incomplete. 


More specifically, under conditions G1, G2 and G3, there is a sentence G (namely, 
(~P#)(~P#), where ~P# is the negation of P#) with the following two properties: 


1. If the system is consistent, then G is not provable; 
2. If the system is stable, then G is neither provable nor refutable. 
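

The two halves of the argument for these properties can be summarized as chains of implications. The display is only a compact restatement of the reasoning given before Proposition G, with the justification for each step noted beside it; the subscripted names $G_1$, $G_2$ and $P_n$ correspond to the conditions and predicates introduced earlier.

```latex
% Property 1: a provable G would make the system inconsistent.
\begin{align*}
G\ \text{provable}
  &\implies P_n(G)\ \text{provable for some } n && (G_1) \\
  &\implies P(G)\ \text{provable}               && (G_2) \\
  &\implies G\ \text{refutable}                 && \text{(key fact)}
\end{align*}
% Property 2: in a stable (hence consistent) system, G is not refutable.
\begin{align*}
G\ \text{not provable}
  &\implies P_n(G)\ \text{refutable for every } n && (G_1) \\
  &\implies P(G)\ \text{not provable}             && \text{(stability)} \\
  &\implies G\ \text{not refutable}               && \text{(key fact)}
\end{align*}
```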


4.2.3. The Rosser Incompleteness Theorem 


Rosser (1964 [1936]) showed the surprising result that for the systems proved 
incomplete by Gödel, it was not necessary to assume stability (or more generally, 
ω-consistency), but the assumption of (simple) consistency sufficed. Now, he did not 
show that under the assumption of consistency, Gödel’s sentence G was neither 
provable nor refutable, but rather that some other (more complex) sentence Z was. 

Continue to assume conditions G1 and G3, but G2 is now irrelevant. In fact, 
Rosser did not use Gödel’s predicate P at all, but rather a more complex predicate H 
which informally has the following meaning: 


For any sentence X, the sentence H(X) informally means that for every number 
n, if X is provable at stage n, then for some number m less than or equal to n, the 
sentence ~X is provable at stage m. 


Now, Rosser showed for the systems studied by Gödel, that not only do conditions G1 
and G3 hold, but also that for any sentence X and any number n, these two 
conditions hold: 


R1: If P_n(X) is provable and P_m(~X) is refutable for every m less than or equal 
to n, then H(X) is refutable. 


R2: If P_n(~X) is provable and P_m(X) is refutable for every m less than or equal 
to n, then H(X) is provable. 


It will soon be shown that from conditions G1, G3, R1 and R2, it follows that if 
the system is consistent, then some sentence is neither provable nor refutable in it. 
But first, it should be pointed out that conditions R1 and R2 are correct under the 
intended meanings of the predicates P_n and H, in that these two conditions hold if 
‘provable’ is replaced by ‘true,’ and ‘refutable’ is replaced by ‘false.’ To see this, 
consider R1: H(X) says that for every number n such that X is provable at stage n, 
there is some m less than or equal to n such that ~X is provable at stage m. Thus, to 
say that H(X) is false is to say that there is at least one n such that X is provable at 
stage n, but there is no m less than or equal to n such that ~X is provable at stage m. 
Well, this is precisely the case if P_n(X) is true and if P_m(~X) is false for each 
m less than or equal to n (because X is then provable at stage n and ~X is not 
provable at any stage m where m is less than or equal to n), and so H(X) is then 
false. Thus, R1 is correct (under the intended interpretation of the predicates). As for 
R2, the argument is a bit more subtle: Suppose that P_n(~X) is true and that for every 
m less than or equal to n, the sentence P_m(X) is false. Then, ~X is provable at stage 
n and for every m less than or equal to n, X is not provable at stage m. Thus, for 
every number m, if X is provable at stage m, then m must be greater than n, and so 
there is a number x less than or equal to m, namely n, such that ~X is provable at 
stage x, and so H(X) is therefore true. Thus R2 is also correct (under the intended 
interpretation of the predicates). 

Now for Rosser’s argument: Assume conditions G1, G3, R1 and R2 and aim to 
show that, if the system is consistent, then there is a sentence that is neither provable 
nor refutable in the system; in fact, it will be shown that H#(H#) is such a sentence. 





Step 1  If X is provable, then H(X) is refutable, and if X is refutable, then H(X) is 
provable. Reason: Suppose X is provable. If the system is not consistent, then of 
course H(X) is refutable (every sentence is), so suppose that the system is 
consistent. Then, since X is provable, ~X is not provable. Then X is provable at some stage 
n, hence P_n(X) is provable (by condition G1). Since ~X is not provable, then for every 
m less than or equal to n (in fact, for every m whatsoever) ~X is not provable at stage 
m, hence P_m(~X) is refutable (again by G1). Then, by condition R1, the sentence H(X) is refutable. 
Thus, if X is provable, H(X) is refutable. 

Now suppose X is refutable. This means that ~X is provable. If the system is 
inconsistent, then certainly H(X) is provable, so suppose that the system is consistent. 
Then X is not provable, hence there is some n such that ~X is provable at stage n, but 
there is no m less than or equal to n (in fact, no m at all) such that X is provable at 
stage m. Thus, P_n(~X) is provable but for every m less than or equal to n, P_m(X) is 
refutable (again by G1). Then by condition R2, the sentence H(X) is provable. 


Step 2  By condition G3, for any expression X, the sentence H#(X) is provable 
(refutable) iff H(X(X)) is provable (respectively, refutable). In particular, taking H# 
for X, note that H#(H#) is provable (refutable) iff H(H#(H#)) is provable (refutable). 
Take Z to be the sentence H#(H#) and note that Z is provable iff H(Z) is provable, 
and is refutable iff H(Z) is refutable. 


Step 3 Suppose Z is provable, then H(Z) is also provable (by step 2), but also H(Z)
is then refutable (by step 1): an inconsistency. Therefore, if the system is consistent,
the sentence Z is not provable.

Suppose Z is refutable. Then H(Z) is on the one hand refutable (by step 2), but
also provable (by step 1): another inconsistency. Thus, if the system is consistent,
then Z is not refutable either.


Discussion Gödel’s sentence G might be thought of as saying,


I am not provable.


and (assuming consistency) is therefore true (under the intended interpretation of 
the predicate P) by virtue of the very fact that it is not provable in the system! 
Rosser's sentence Z, on the other hand, is a more complex one which might be 
thought of as saying, 


At any stage that I am provable, there is an earlier stage at which I am refutable.

Since Z is not provable at all (assuming consistency), then what it says is really so,
hence Z is true (under the intended interpretation of the predicate H).

4.2.4. Parikh sentences

Say that a sentence X is provable by stage n (as distinct from at stage n) if X is
provable at stage m, for some m less than or equal to n. Now, given predicates P_1,
P_2, ..., P_n, ... satisfying conditions G,, it is easy to obtain, for each n, a predicate S_n
such that for any sentence X, two conditions hold:


B_1: If X is provable by stage n, then S_n(X) is provable.
B_2: If X is not provable by stage n, then S_n(X) is refutable.





Note that if the system is consistent, then the converse of condition B_2 must also
hold, because suppose that S_n(X) is refutable. If X were provable by stage n, then by
B_1, S_n(X) would be provable and the system would be inconsistent. Thus:


By: S_n(X) is refutable iff X is not provable by stage n.


Now, Parikh (1971) made a very interesting observation: For each n, let Y_n be the

By G,, the sentence Y_n is provable iff S̄_n(Y_n) is provable. Thus, Y_n
is provable iff S_n(Y_n) is refutable. But also, assuming consistency, S_n(Y_n) is refutable
iff Y_n is not provable by stage n (by B;), and therefore, Y_n is provable iff Y_n is not
provable by stage n. Thus, either Y_n is not provable and also provable by stage n,
which is absurd, or provable but not by stage n. And so, Y_n is provable, but not by
stage n. There is thus a uniform method for obtaining, for each number n, a
sentence Y_n that is provable but not by stage n! More interesting yet, the proof that
Y_n is provable is relatively short and provable at a relatively earlier stage. That is, for
n sufficiently large, the sentence P(Y_n) (where P is Gödel’s provability predicate),
which asserts that Y_n is provable - this sentence P(Y_n) is provable at a stage m,
where m is less than n (considerably less than n, for n sufficiently large). This might
be paraphrased by saying that, for sufficiently large n, the sentence Y_n is provably
provable long before it is provable!


4.2.5. Gödel’s second incompleteness theorem


Let f be any refutable sentence of the system (Gödel chose for f the sentence 1 = 0).
If f were provable, then every sentence would be provable and the system would be
inconsistent. Conversely, if the system were inconsistent, then f would be provable.
Thus, the system is consistent iff f is not provable, and so, the sentence P̄(f) is true
iff the system is consistent, and so this sentence can be thought of as expressing the
consistency of the system. This sentence has an important role and has been dubbed
‘Consis.’ This sentence Consis is an arithmetic one that expresses the consistency of
the system. Now, an interesting question arises: Is the sentence Consis itself provable
in the system? This is tantamount to asking whether the consistency of the system is
provable within the system.

Before answering this, it should be pointed out that the technical symbol for
implication is ‘⊃’. For any sentences X and Y, the sentence X ⊃ Y is read ‘If X,
then Y’. Obviously, anything implied by a true proposition must be true, and so if
X and X ⊃ Y are both true, so is Y. Well, in any of the systems under consideration,
if X and X ⊃ Y are both provable in the system, so is Y. Now, G expresses the
non-provability of G itself and Consis expresses the consistency of the system, and so the
sentence Consis ⊃ G expresses the proposition that if the system is consistent, then
G is not provable. Well, this is the first half of Gödel’s (first) incompleteness
theorem, and so the sentence Consis ⊃ G is indeed true. Moreover, the sentence
Consis ⊃ G is not only true, but is even provable in the system! (The demonstration
of this is extremely elaborate!) Therefore, if Consis were also provable, then from the
two sentences Consis and Consis ⊃ G, one could infer G, and hence G would be
provable, which would mean that the system was inconsistent (by Gödel’s First
Incompleteness Theorem). And so, if the system is consistent, then the sentence
Consis is not provable in the system! Thus, the system, if consistent, cannot prove its
own consistency. This is the result known as Gödel’s Second Incompleteness Theorem.

Unfortunately, Gödel’s second theorem has sometimes been misinterpreted to
mean that one can never know that mathematics is consistent! To see how wrong
that is, suppose it had turned out that Consis was provable in the system - or to be
more realistic, imagine considering a system that could prove its own consistency.
Would that be any reasonable grounds for trusting the consistency of the system? Of
course not! If the system were inconsistent, then it could prove every sentence -
including the statement of its own consistency! To trust the consistency of a system
on the grounds that it can prove its own consistency is as foolish as trusting a
person’s veracity on the grounds that he claims that he never lies. And it is likewise
irrational to have doubts about the consistency of a system just because it cannot
prove its own consistency. The fact that a system, if consistent, cannot prove its own
consistency sheds not the faintest light on whether the system is consistent. Whether
a given system is consistent must be judged on other grounds.

4.3. The Unsolvable


The philosopher Leibniz envisioned the possibility that one day a universal calculator
would be found that would mechanically solve all mathematical problems (as
well as all philosophical ones). Is this dream of Leibniz realizable? To answer this,
some background is necessary.

There is a type of computing machine whose function is to generate a set of
(positive whole) numbers. (From now on, ‘number’ will mean positive integer.) For
example, a machine might be programmed to do two things:


1 Print the number 2.
2 Whenever you print a number x, follow it by printing x + 2.


These are the only two instructions the machine has. Then, obviously, the machine
will successively print the numbers 2, 4, 6, 8 - that is, it will generate the set of even
numbers. If, on the other hand, the machine’s first instruction had been to print 1,
instead of 2, then it would have generated the set of odd numbers. For any set A of
(positive whole) numbers, say that a machine M enumerates or generates A if M
prints out all numbers in A but no number outside A. And a set A is called
recursively enumerable (an alternative term might be mechanically generatable) if
there is a machine M that generates it. Examples of recursively enumerable sets
abound in mathematics (as examples, the set of even numbers, the set of odd
numbers, the set of prime numbers, the set of all numbers divisible by 3 - virtually
all the sets that are dealt with in number theory).

For any set A, let A′ be the set of all numbers that are not in A. The set A′ is
called the complement of A. For example, the complement of the set of even numbers
is the set of odd numbers. Now, a set A of numbers is called solvable or
recursive if both A and its complement A′ are recursively enumerable. The reason
for the word ‘solvable’ is this: Suppose one machine M generates A and another
machine N generates the complement A′ of A. Thus, for any number n, there is an
effective test to see whether n belongs to A: Set both machines going simultaneously
and wait for the numbers. If n is in A, then sooner or later, it will become
clear, since M will eventually print n. If n is not in A, this will also become clear,
since N will sooner or later print n. Thus, there is a mechanical ‘solution’ to the
problem of which numbers are in A and which ones are not.
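
The effective test just described can be mimicked in a few lines of code. What follows is only an illustrative sketch in Python, not part of the original argument: the names `evens`, `odds`, and `belongs` are hypothetical, and the even numbers stand in for an arbitrary solvable set A whose complement is generated by a second machine.

```python
from itertools import count

def evens():
    """Machine M: prints 2, 4, 6, ... (the set A of even numbers)."""
    for k in count(1):
        yield 2 * k

def odds():
    """Machine N: prints 1, 3, 5, ... (the complement of A)."""
    for k in count(1):
        yield 2 * k - 1

def belongs(n, machine_m, machine_n):
    """Run both machines in alternation until one of them prints n."""
    m, c = machine_m(), machine_n()
    while True:
        if next(m) == n:
            return True   # M printed n, so n is in A
        if next(c) == n:
            return False  # N printed n, so n is in the complement of A

in_a = belongs(10, evens, odds)
not_in_a = belongs(7, evens, odds)
```

The loop is guaranteed to stop only because every positive integer is printed by exactly one of the two machines; that is precisely what solvability provides.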

Now suppose a set A is recursively enumerable but not solvable. Then there is a
machine M that generates A but no machine to generate the complement A′ of A.
Suppose again that one would like to know whether a given number n is or is not in
A. The best that can be done is to set the machine going and hope for the best: If
n is in A, sooner or later this will become known, since M will sooner or later print
it, but if n is not in A, then M will never print n. But no matter how long one waits,
there is no assurance that M might not print n at some later time. Thus, if n is in A,
sooner or later this will be known, but if n is not in A, then at no time can one
definitely know that it isn’t (at least by observing only the machine M). Such a set
A is aptly called semi-solvable.

There are infinitely many of these possible generating machines - as many as there
are positive integers. Each of these machines has a program which is not given in
English, but is coded into a positive integer, and matters can be arranged so that
every positive integer is the code of some generating machine (but it may be that a
machine may have several different code numbers). For each number n, let M_n be
the machine whose code is n, and imagine all the machines listed in the infinite
sequence M_1, M_2, ..., M_n, ...
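
The predicament of an observer who has access to the machine M alone can be pictured as follows. This is a hedged Python sketch with hypothetical names (`squares`, `watch`); the set of squares is of course solvable and serves only as a stand-in, since a genuinely unsolvable set cannot be exhibited this simply.

```python
from itertools import count, islice

def squares():
    """Stand-in machine M: prints 1, 4, 9, 16, ..."""
    for k in count(1):
        yield k * k

def watch(n, machine, steps):
    """Observe the first `steps` outputs of the machine.

    Returns True if n has been printed, and None (still unknown) otherwise.
    It never returns False: watching M alone cannot show that n is absent.
    """
    for printed in islice(machine(), steps):
        if printed == n:
            return True
    return None

seen = watch(16, squares, 10)
unknown = watch(15, squares, 10)
```

However large `steps` is chosen, a result of `None` settles nothing about whether n will be printed later.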

Also code every pair (x, y) of numbers to a single number x * y (a simple coding
device that works is to take x * y to be a string of 1s of length x followed by a string
of 0s of length y - for example, 3 * 4 would be the number 1110000).
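
The coding device can be written out directly. The following Python sketch is illustrative only; the function names `encode` and `decode` are hypothetical, and the decimal numeral is used as the "string" of 1s and 0s.

```python
def encode(x, y):
    """Code the pair (x, y) as x ones followed by y zeros, read as a number."""
    if x < 1 or y < 1:
        raise ValueError("x and y must be positive integers")
    return int("1" * x + "0" * y)

def decode(code):
    """Recover (x, y) from a code number, rejecting numbers of the wrong shape."""
    digits = str(code)
    x = len(digits) - len(digits.lstrip("1"))  # length of the leading run of 1s
    y = len(digits) - x                        # everything after it must be 0s
    if x < 1 or y < 1 or digits != "1" * x + "0" * y:
        raise ValueError("not the code of a pair")
    return x, y

example = encode(3, 4)
```

Because the ones always come first and both runs are non-empty, distinct pairs receive distinct code numbers, which is all the argument requires.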

The first important feature of this battery of generating machines is that one of
them, U, is a so-called universal machine which is programmed to systematically
observe the behavior of all the machines M_1, M_2, ..., M_n, ... and whenever a
machine M_x prints out a number y, the universal machine U reports this fact by
printing the number x * y, and these are the only numbers that U ever prints. Thus,
for any numbers x and y, the machine U prints x * y iff M_x prints y.

For example, suppose M_7 is programmed to generate the set of even numbers
and M_9 is programmed to generate the set of odd numbers. Then U will print the
numbers 7 * 2, 7 * 4, 7 * 6, etc. and also the numbers 9 * 1, 9 * 3, 9 * 5, etc. but U
will never print 7 * 3, or 9 * 4.

This universal machine is a wonderful thing, in that access to this one machine U
is as good as having access to the entire infinite battery of the generating machines,
since whenever machine M_x prints y, one will know it by just observing the behavior
of U, which will then print x * y.
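
A toy version of the universal machine may help fix ideas. The sketch below is in Python and rests on loudly stated assumptions: the battery contains only the two machines of the example (coded 7 and 9), every machine prints infinitely many numbers, and the names `universal`, `battery`, and `encode` are hypothetical. The real U watches infinitely many machines, some of which may print only finitely many numbers, and so needs a more careful interleaving.

```python
from itertools import count, islice

def evens():
    for k in count(1):
        yield 2 * k

def odds():
    for k in count(1):
        yield 2 * k - 1

def encode(x, y):
    """Pair code x * y: x ones followed by y zeros."""
    return int("1" * x + "0" * y)

def universal(battery):
    """Watch every machine in turn and report 'M_x printed y' as x * y."""
    running = {x: start() for x, start in battery.items()}
    while True:
        for x, machine in running.items():
            yield encode(x, next(machine))

battery = {7: evens, 9: odds}   # assumed code numbers, as in the example
printed = list(islice(universal(battery), 6))
```

Taking turns among the machines ensures that no single machine monopolizes U's attention, so every report x * y eventually appears.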

A second important feature of these machines is that to each machine M_a there is
associated a machine M_b which is said to diagonalize M_a, such that M_b prints those
and only those numbers x such that M_a prints x * x. M_b ‘keeps watch,’ so to speak,
on M_a and is instructed to print out x whenever M_a prints x * x.

Let us record these two vital facts. 








Fact 4.1 The universal machine U prints those and only those numbers x * y
such that M_x prints y.


Fact 4.2 For each machine M_a, its diagonalizer M_b prints those and only those
numbers x such that M_a prints x * x.


Let K be the set generated by the universal machine U. This set K (which has
been dubbed the complete set) will be seen to contain all the information about all
mathematical systems: If one could solve this one set K, then one could solve all
questions about all mathematical systems, so the vital question is this: Is the set K
solvable? Since U generates K, the question is thus whether there is a generating
machine that generates the complement K′ of K. In other words, is there a generating
machine that prints all and only those numbers that the universal machine U
does not print?

Well, let M be any one of the generating machines. By Fact 4.2, there is a number
b such that M_b diagonalizes M. Now, for any number x, the universal machine U
prints b * x iff M_b prints x, which, in turn, is the case iff M prints x * x. Since this
holds for every number x, then, in particular, it holds when x is the number b, and
so U prints b * b iff M prints b * b. Thus, if we let k be the number b * b, we see that
either U and M both print k or neither one prints k. Thus, it is not the case that M
prints k iff U does not print k, and so it is not the case that for all numbers x, the
machine M prints x iff U does not print x. Thus, there is no generating machine that
prints those and only those numbers that U does not print!

Letting K be the set generated by U, it has just been shown that there is no
generating machine M that generates the complement K′ of K. This leads to the
following basic result:


Proposition 4.1 The set K generated by U, though recursively enumerable, is
not solvable (not recursive).


Now, let us see some of the important ramifications of Proposition 4.1: First of
all, it yields yet another method of proving the incompleteness of the type of system
under consideration: A predicate H is said to represent (in the system) the set
of all numbers n such that H[n] is provable in the system. Thus, for any set A of
numbers, to say that H represents A is to say that for every number n, the sentence
H[n] is provable in the system iff n is a member of A. And a set A is said to be
representable in the system if some predicate H represents it. Now, there is a variety
of systems called formal systems having the property that every set representable in
the system is recursively enumerable. Thus, for every predicate H of such a system,
there is a machine M that prints those and only those numbers n such that H[n] is
provable in the system. All of the systems investigated by Gödel are indeed formal
systems. Moreover, all recursively enumerable sets are representable in each of these
systems (assuming consistency). A system is called Gödelian if it is formal and if all
recursively enumerable sets are representable in it. Thus, the sets representable in a
Gödelian system are precisely the recursively enumerable sets - a set is representable
in the system iff it is recursively enumerable. Note that a Gödelian system is
automatically consistent, since if a system is inconsistent, then the only representable set
is the set N of all positive integers, and moreover every predicate H represents N,
since H[n] is provable for every number n (all sentences are provable). Thus, a
Gödelian system is consistent.

Well, Proposition 4.1 has a very important consequence for Gödelian systems:
Consider any Gödelian system. Since all recursively enumerable sets are representable
in it, then, in particular, the set K generated by the universal machine U is
representable in it. Let H be a predicate of the system that represents K. By Proposition
4.1, the complement K′ of K is not recursively enumerable, hence not representable
in the system (since only recursively enumerable sets are representable in a
Gödelian system). In particular, the negation H̄ of H fails to represent the set K′.
This means that either H̄[n] fails to be provable for some n in K′, or there is some
n not in K′ (and hence in K) for which H̄[n] is provable. But if H̄[n] were provable
for some n in K, the system would be inconsistent, because H[n] is provable for
every n in K, and H[n] and H̄[n] cannot both be provable in a consistent system.
Since the system is assumed consistent (being Gödelian) it cannot be that H̄[n] is
provable for some n outside K′, and, therefore, it must be that H̄[n] fails to be
provable for some n in K′. The sentence H[n] is not provable for such an n either,
since n is not in the set represented by H. And so, neither H[n] nor its negation
H̄[n] is provable in the system.

Thus Proposition 4.1 yields:


Proposition 4.2 Every Gödelian system is incomplete. More specifically, in a
Gödelian system, there is a predicate H that represents the set K generated by U,
and there is a number n in K′ such that neither H[n] nor its negation H̄[n] is
provable in the system.


Having now seen how Proposition 4.1 is related to Gödel’s theorem, how is it
related to Leibniz’s vision?

Any formal mathematical problem can be translated into a question of whether a
machine M_a does or does not print out a number n. That is, given any formal
system, one can assign Gödel numbers to all the sentences of the system and find
a number a such that the machine M_a prints out the Gödel numbers of the provable
sentences of the system and no others. And so, to find out whether a given
sentence is or is not provable in the system, for its Gödel number n one can ask
whether machine M_a does or does not print n, or equivalently, whether the universal
machine does or does not print the number a * n. Therefore, a complete
knowledge of U would entail a complete knowledge of all formal systems. Conversely,
any question of whether a given machine prints out a given number can be
reduced to a question of whether a certain sentence is provable in a certain system,
because one can take a formal system in which the set K is represented by a
predicate H, and so for any number n, the machine U prints n iff H[n] is provable
in the system. Thus, a complete knowledge of the set K is tantamount to a complete
knowledge of all formal mathematical systems.

Now, what does all this mean with respect to Leibniz’s vision? Strictly speaking,
one cannot prove or disprove the feasibility of Leibniz’s hope, because it was not
stated in an exact form. Indeed, no precise notion of a ‘calculating machine’ or
‘generating machine’ existed in Leibniz’s day; these notions have been rigorously
defined only in this century. They have been defined in many different ways by
various mathematical logicians (including Gödel), but all these definitions have been
shown to be equivalent. If by ‘solvable’ is meant solvable according to any of these
equivalent definitions, then Leibniz’s hope is not realizable, because the fact simply
is that there is a universal machine U and each machine has a diagonalizer, hence
Proposition 4.1 does hold and thus the set K generated by U is not solvable; it is
only semi-solvable (recursively enumerable). Therefore, there is no purely ‘mechanical’
procedure for finding out which sentences are provable in which formal systems.
Thus, any attempt to invent a clever ‘mechanism’ that will solve all mathematical
problems is simply doomed to failure.

In the prophetic words of the logician Post (1964 [1944]), this means that mathematical
thinking is, and must remain, essentially creative. Or, as commented by the
mathematician Rosenbloom (1950), it means that man can never eliminate the
necessity of using his own intelligence, regardless of how cleverly he tries.

Suggested further reading


A systematic approach to Gödel’s theorems presupposes some first-order logic, for which
Smullyan (1995) is an introduction (emphasizing the relatively modern method of tableaux).
The first eight chapters constitute sufficient preparation for Smullyan (1992). [See also chapter
1 of the present volume.] The author’s (1992), though completely rigorous, contains
what is probably the simplest available proof of Gödel’s Theorem. The usual machinery of
primitive recursive functions and the Chinese Remainder Theorem are completely avoided.
Smullyan (1994) is a very comprehensive treatment of many topics bearing on self-reference,
including abstract self-reference, elementary formal systems, incompleteness theorems, recursion
theory and combinatory logic. It combines an introduction with a presentation of new
results in these fields. Only one chapter, not even necessary for any of the others, presupposes
any knowledge of first-order logic; all the other chapters are self-contained. Quine (1940,
ch. 7; 1946) was the original inspiration for the modern approach to Gödel’s theorems used
in Smullyan (1992). It introduces the interesting subject of Protosyntax Self-Applied. Quine
(1946) offers a very clever and influential idea subsequently used in the simplification of
Gödel’s proof.

Chapter 5
Truth 
Anil Gupta 


5.1. Introduction 


The concept of truth serves in logic not only as an instrument but also as an object
of study. Eubulides of Miletus (fl. fourth century BCE), a Megarian logician, discovered
the paradox known as ‘the Liar,’ and, ever since his discovery, logicians down
the ages - Aristotle and Chrysippus, John Buridan and William Heytesbury, and
Alfred Tarski and Saul Kripke, to mention just a few - have tried to understand the
puzzling behavior of the concept of truth.

In Eubulides’ paradox, it is supposed that a person X says


What I am now saying is false


and he says nothing more. The supposition is plainly coherent, but it leads via highly
plausible arguments to absurd conclusions. If what X says is true, then X’s statement
must be assessed to be false (because X claims to have said something false).
But if what X says is false, then X’s statement must be assessed to be true (because,
again, X claims to have said something false). The truth of X’s statement implies,
therefore, its falsity, and the falsity of X’s statement implies, in turn, its truth. The
original supposition seems thus to imply a contradiction.

The very simplicity of Eubulides’ paradox has provoked numerous simple ‘solutions’
of it. There is, for instance, the idea (put forward by Bar-Hillel and others)
that the paradox is removed, and the entire problem solved, simply by noting that
truth is a property of propositions (not of sentences) and that X’s words do not
express a proposition. But the solution fails. First, the paradox reappears if X’s
statement is reformulated a little:


I am not now expressing a true proposition.


If X’s words do not express a proposition, they do not express a true proposition.
Hence what X says should be assessed to be true, and one is again on the path to a
contradiction. Second, paradoxical behavior is sometimes exhibited by contingent
statements, and it is not plausible to maintain that these do not express propositions.
St. Paul attributes, in his Epistle to Titus, the following remark to the Cretan
Epimenides:


The Cretans are always liars


Let us assume that ‘liar’ applies to people who have never uttered a truth (even
unintentionally). Then, as has often been noted, the Epimenides remark is paradoxical
if all other Cretan utterances happen to be untrue, and the remark is not paradoxical
(but simply false) otherwise. Suppose that no other Cretan utterance is true. Then
the Epimenides remark is contingently paradoxical, but it expresses a proposition.
For the remark can be embedded in a true belief attribution - for example,


St. Paul believed that the Cretans were always liars 


a belief attribution that can explain some of St. Paul’s behavior (Gupta and Belnap,
1993, pp. 7-12).

The central problem posed by the Liar paradox remains whether one takes propositions
or sentences to be the bearers of truth. In fact, the major current theories of
truth and paradox can be formulated for either type of truth bearer. So, for simplicity,
assume sentences to be the bearers of truth; this allows one to bypass the theory
of propositions. If context-sensitive elements are present, truth can be treated as a
relational property: a sentence will count as true relative to the relevant contextual
elements. The complexities of paradox appear even in languages without context-sensitive
elements, and, for the most part, attention is restricted here to such languages.

The paradox brings to the surface a problem with the basic principles governing
truth, the principle that a sentence A follows from its truth attribution ‘‘A’ is true’
(Truth Elimination) and the converse principle that ‘‘A’ is true’ follows from A
(Truth Introduction). The principles can be combined, following Tarski, into the
T-schema:


(T) ‘A’ is true if and only if A.

Instances of the T-schema will be called ‘T-biconditionals’. The paradox shows that,
in the presence of certain kinds of self-reference, the T-biconditionals imply a contradiction.
Suppose that l denotes ‘l is not true’, so l says of itself that it is not true.
Then,

l = ‘l is not true’ (5.1)

Now the T-biconditional for l


‘l is not true’ is true if and only if l is not true


and (5.1) together imply a contradiction within classical logic.
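
That no classical truth value satisfies these two conditions can be checked by brute force. The following Python fragment is a minimal sketch, not part of the original argument; the name `t_biconditional_holds` is hypothetical, and classical logic is modeled simply by the two values True and False.

```python
def t_biconditional_holds(l_is_true):
    """Check the T-biconditional for l under one classical assignment."""
    # By (5.1), l is the very sentence 'l is not true', so the left-hand side
    # "'l is not true' is true" has the same truth value as "l is true".
    left = l_is_true
    # Right-hand side: "l is not true".
    right = not l_is_true
    return left == right

# Collect the classical truth values for l that satisfy the biconditional.
classical_values = [v for v in (True, False) if t_biconditional_holds(v)]
```

The list `classical_values` comes out empty: neither assignment makes the biconditional hold, which is the contradiction in miniature.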

Gödel showed that self-referential statements can always be formulated in a
language that contains certain syntactic resources (such as concatenation and names
for symbols). [See chapter 4.] The problem the paradox presents can therefore be
formulated as follows: the combination


T-biconditionals + Syntactic richness + Classical Logic


leads to contradictions. None of these elements is easily abandoned - not the T-biconditionals,
not the syntactic resources, and not classical logic. The paradox thus
creates a difficult problem. It has led some (e.g., Chihara and Priest) to espouse the
inconsistency view of truth, the view that the rules governing truth are inconsistent.
But this move only deepens the mystery created by the paradox. The concept of
truth is used in ordinary, everyday situations without falling into incoherence. How
is this achieved, working with an inconsistent concept? How can an inconsistent
concept do useful work? Notwithstanding the paradoxes, it is possible to make clear
and unproblematic truth-attributions, for example, to the sentences ‘Snow is white’,
‘‘Snow is white’ is true’, and ‘Some of Aristotle’s claims are not true’. How can this
be done? The primary challenge that the paradox poses is to provide a better logical
understanding of the concept of truth, one that explains the ordinary and the
extraordinary behavior of the concept in a rich setting (rich syntax and, for example,
full classical logic). The argument of the paradox can be blocked in numerous ways.
It is possible, for instance, to weaken syntactic resources or eliminate negation from
the language. But such moves do not meet the challenge.

The argument of the paradox not only raises a good problem; it serves a constructive
purpose as well. It figures in Tarski’s proof of his indefinability theorem: the set of
Gödel numbers of the truths of arithmetic is not the extension of any arithmetical
formula; more briefly, arithmetical truth is not definable within the arithmetical
language [see chapter 4]. The arithmetical language ℒ can be extended to a new
language ℒ′ by adding to it a predicate that expresses arithmetical truth (i.e., ‘true in
ℒ,’ or more explicitly, ‘Gödel number of a true sentence of ℒ’). But the set of Gödel
numbers of the truths of ℒ′ will not be definable in ℒ′. One can move to a richer
language, ℒ″, by adding to ℒ′ a new predicate that expresses ‘true in ℒ′.’ But, again,
truth for ℒ″ will not be definable in ℒ″. The move of adding new truth predicates
can be repeated indefinitely and generates a Tarskian hierarchy of increasingly richer
languages, ℒ, ℒ′, ℒ″, ...

The Tarskian hierarchy of languages and truth predicates can be constructed by
beginning with any interpreted base language that is free of truth and related
notions. Let ℒ_0 be such a language and let the expressions ‘snow’, ‘is white’, ‘grass’,
and ‘is green’ belong to it. Then the next language in the hierarchy, ℒ_1, contains a
truth predicate, ‘true_1,’ that applies to all the true sentences of ℒ_0. For example,
‘true_1’ applies to sentences such as





Snow is white. 
and 


Grass is green

and not to sentences such as

Snow is green.

and

‘Snow is white’ is true_1.


ℒ_2, the next higher language, contains a truth predicate, ‘true_2’, that has a wider
application than ‘true_1.’ ‘True_2’ applies also to sentences of ℒ_1 such as





‘Snow is white’ is true_1


that do not belong to ℒ_0. In general, a truth predicate of level n, ‘true_n,’ applies only
to lower-level sentences, i.e. to sentences that belong to ℒ_m (m < n); it does not
apply to sentences of level ≥ n. The T-schema for ‘true_n,’


(T_n) ‘A’ is true_n if and only if A


holds only for sentences A of levels < n. The restriction means that ‘true_n’ is bound
to be free from paradox. Suppose l_n denotes ‘l_n is not true_n’. Then l_n does not belong
to the extension of ‘true_n’ for its level is too high. So l_n is not true_n. But no contradiction
follows, because the T_n-schema cannot be instantiated with ‘l_n is not true_n.’
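
The level restriction can be made concrete with a toy model. The Python sketch below is an illustration under stated assumptions only: base sentences are plain strings with stipulated truth values in a hypothetical table `FACTS`, a pair `(n, s)` stands for the sentence ‘‘s’ is true_n’, and the functions `level` and `holds` are invented for the purpose. Since finite nested pairs cannot refer to themselves, the self-referential sentence l_n is not modeled.

```python
# Stipulated truth values for sentences of the base language (level 0).
FACTS = {"Snow is white": True, "Grass is green": True, "Snow is green": False}

def level(sentence):
    """Level of a sentence: 0 for base sentences, at least n for (n, inner)."""
    if isinstance(sentence, str):
        return 0
    n, inner = sentence
    return max(n, level(inner))

def holds(sentence):
    """Evaluate a sentence; 'true_n' applies only to true sentences of level < n."""
    if isinstance(sentence, str):
        return FACTS[sentence]
    n, inner = sentence
    return level(inner) < n and holds(inner)

snow_true_1 = (1, "Snow is white")     # 'Snow is white' is true_1
too_high = holds((1, snow_true_1))     # true_1 applied to a level-1 sentence
higher_ok = holds((2, snow_true_1))    # true_2 applied to the same sentence
```

Here `too_high` is False and `higher_ok` is True, mirroring the text: ‘true_1’ does not apply to a sentence of its own level, while ‘true_2’ does.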

The hierarchy provides, then, an effective way of constructing paradox-free concepts
of truth. Until about the late 1960s, it was the predominant view in philosophical
logic that the hierarchy provides the only way of making sense of the ordinary
concept of truth: meaningful uses of ‘true’ must be seen as representing one of the
truth predicates in a Tarskian hierarchy. This conception was regarded not so much
as a view, but as something that was forced by the paradoxes - or by some related
mathematical theorems (e.g., Tarski’s theorem) or some related principles of philosophical
logic (e.g., Russell’s vicious circle principle). Truth attributions, it was held,
had to be expressed in a ‘higher metalanguage.’ In 1967, Prior wrote (p. 230):


Further, Tarski argues, a sentence asserting that some sentence S is a true sentence of
some language L cannot itself be a sentence of the language L, but must belong to a
metalanguage in which the sentences of L are not used, but are mentioned and discussed.
He is led to this view by the paradox of the ‘liar.’


Prior endorses the point he attributes to Tarski, as did many others when Prior 
wrote his article. 

It was in this environment that the groundbreaking and seminal work of Martin
and Woodruff (1984 [1975]) and Kripke (1984 [1975]) appeared on the philosophical
scene. These authors established, contrary to the prevalent dogma, that
attributions of truth do not always force a move to a metalanguage. They proved via
a fixed-point argument that certain three-valued languages contain their own truth
predicates. Kripke’s essay, in particular, introduced mathematical tools (e.g., his
definition of ‘groundedness’) that were, and remain, fruitful. The ‘fixed-point’
theories of Martin, Woodruff, and Kripke are presented in section 5.2.

Fixed-point constructions work, as shall be seen, only when languages are expressively
weak - i.e., only when certain logical or syntactic notions are not expressible in
them. The fundamental problem raised by the paradoxes - the problem of interpreting
truth in rich languages - thus remains. The theories developed by Herzberger,
Belnap, and me - the revision theories - offer a solution to this problem. These
theories are presented in section 5.3.

Theories of the third group, the contextual theories, have been advocated by
Parsons, Burge, Gaifman, and others. These theories make the interpretation of the 
truth predicate dependent on context. The motivating intuitions for this idea are 
explained in section 5.4. 

Before I turn to an exposition of these three types of theories - the fixed-point,
revision, and contextual theories - let me stress the limitations of this essay. First, the
literature on truth is vast and I have not even attempted to summarize it. In particular,
the three main types discussed here encompass many different theories, and I have
provided an exposition only of some representative members. Visser’s (1989) and
Chapuis’s (1998) articles are valuable surveys for further information. Second, I have
restricted myself to semantical topics and issues. There is also a rich proof-theoretical
side of the subject. A useful guide to this is provided by Sheard (1994). Third,
I have not attempted to explain the applications of the theories discussed here to the
paradoxes of class, property, belief, rationality, and other concepts.


Preliminary matters. We shall identify interpreted languages ℒ with ordered triples
⟨L, M, ρ⟩, where L carries syntactic information about ℒ, M provides interpretations
for the non-logical constants of ℒ (M is called variously ‘model,’ ‘structure,’ and
‘interpretation’), and ρ is the semantic scheme for determining the interpretations of
compound expressions. Of particular interest are classical languages. In these
languages, the model M can be identified with an ordered pair ⟨D, I⟩, where D, the
domain, is a non-empty set and I, the interpretation function, assigns to each name
a member of D, to each n-ary function symbol a member of Dⁿ → D, and to each
n-ary predicate a member of Dⁿ → {t, f}. The extension of an n-ary predicate G is the
set of n-tuples that are assigned the value t by I(G). We shall sometimes specify the
interpretation of a predicate G in a classical language ℒ by providing the extension of
G in ℒ. We shall call the classical scheme ‘τ’ and we shall assume Tarskian definitions
of notions such as ‘object d satisfies a formula A(x) in ℒ’ and ‘sentence A is true in
ℒ.’ [See chapter 1.]
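
To make the bookkeeping concrete, here is a minimal Python sketch of a classical model as a pair of a domain and an interpretation function. The particular domain and the symbols `a`, `succ`, and `Even` are hypothetical illustrations, not taken from the text.

```python
from itertools import product

# Hypothetical toy model: a domain D and an interpretation function I.
D = {0, 1, 2}
I = {
    "a": 0,                                        # a name denotes a member of D
    "succ": lambda d: (d + 1) % 3,                 # a 1-ary function symbol: D -> D
    "Even": lambda d: "t" if d % 2 == 0 else "f",  # a 1-ary predicate: D -> {t, f}
}

def extension(pred, arity=1):
    """Return the set of n-tuples over D that I(pred) maps to 't'."""
    return {tup for tup in product(sorted(D), repeat=arity)
            if I[pred](*tup) == "t"}
```

Specifying `Even` by its extension, as the text allows, amounts to listing the tuples that `extension("Even")` returns.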


5.2. Fixed-Point Theories 


Martin, Woodruff, and Kripke showed via a fixed-point argument that certain
three-valued languages can contain their own truth predicates. This section begins with a
brief account of one of the three-valued languages - one based on Kleene’s strong
valuation scheme (henceforth Strong Kleene) - and then turns to fixed points and
their significance.

In a three-valued language, sentences can be true, false, or neither-true-nor-false.
The semantic values of sentences include, therefore, not only the classical t (‘the
true’) and f (‘the false’) but also the value n (‘the neither-true-nor-false’). The
interpretation of an n-place predicate G in a three-valued model M (= ⟨D, I⟩) is a
member of the set Dⁿ → {t, f, n}. (In other respects, three-valued models are exactly
like the classical ones.) The extension of G in M is the set of n-tuples that are
assigned the value t by I(G), and the antiextension is the set of n-tuples that
are assigned the value f. If the extension and the antiextension of G exhaust the set
Dⁿ - that is, if n does not belong to the range of I(G) - then the interpretation of
G is said to be classical. If all predicates receive classical interpretations in M, then M
is a classical model. By these definitions, all classical models count as three-valued
but, of course, not all three-valued models are classical.

The Strong Kleene valuation scheme, κ, evaluates sentences for truth and falsity
as follows. If, in a model M, the terms t₁, ..., tₙ denote the objects d₁, ..., dₙ,
respectively, then the semantic value of G(t₁, ..., tₙ) in M is I(G)(d₁, ..., dₙ). So
G(t₁, ..., tₙ) is true in M iff (if and only if) the sequence ⟨d₁, ..., dₙ⟩ belongs to
the extension of G; the sentence is false iff ⟨d₁, ..., dₙ⟩ belongs to the antiextension;
and it is neither true nor false, otherwise. The connectives ~ and & express the
following functions:

~f = t,   ~n = n,   ~t = f

and

(t & t) = t
(n & n) = (t & n) = (n & t) = n
(v & f) = (f & v) = f, for all v ∈ {t, f, n}





Finally, the universal quantifier ∀ expresses a generalized conjunction. The other
connectives (∨, →, ↔) and the quantifier (∃) receive their standard definitions. It
follows therefore that a disjunction, for example, is true iff one of its disjuncts is
true; the disjunction is false iff both the disjuncts are false; and the disjunction is
neither true nor false, otherwise. For another example, ∃xFx is true iff F is true of at
least one object in the domain; it is false iff F is false of all the objects in the domain;
and it is neither true nor false, otherwise. (See Gupta and Belnap (1993, section
2A) for a more leisurely exposition of the Strong Kleene and other schemes.) [See
also chapter 14.]
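
The Strong Kleene tables can be written out as a short Python sketch. The function names are illustrative choices; disjunction and the existential quantifier are obtained from negation, conjunction, and the universal quantifier by the standard definitions, as the text indicates.

```python
def neg(v):
    """Strong Kleene negation: swaps t and f, leaves n fixed."""
    return {"t": "f", "f": "t", "n": "n"}[v]

def conj(v, w):
    """Strong Kleene conjunction: f dominates; t only if both are t."""
    if "f" in (v, w):
        return "f"
    return "t" if (v, w) == ("t", "t") else "n"

def disj(v, w):
    """Disjunction via its standard definition from ~ and &."""
    return neg(conj(neg(v), neg(w)))

def forall(values):
    """Universal quantifier as generalized conjunction over the
    values that a formula takes on the objects of the domain."""
    values = list(values)
    if "f" in values:
        return "f"
    return "t" if all(v == "t" for v in values) else "n"

def exists(values):
    """Existential quantifier via its standard definition from ~ and the universal."""
    return neg(forall(neg(v) for v in values))
```

For instance, `disj("t", "n")` comes out true because one disjunct is true, while `exists(["f", "n"])` is neither true nor false.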

Two properties of the Strong Kleene scheme κ deserve notice. First, κ respects
classical semantics: if the components of a compound receive classical values, then
κ’s assessment of the compound coincides with that of the classical scheme. Hence,
the two schemes display perfect agreement on classical models. Second, the scheme
has the following monotonicity property. Impose on the set of values {t, f, n} the
ordering ≤ on which n ≤ t, n ≤ f, and t and f are incomparable. (See
figure 5.1.) This ordering yields pointwise an information ordering ≤ on the
interpretations of predicates: Let M = ⟨D, I⟩ and let h, h′ ∈ Dⁿ → {t, f, n}. Then h ≤ h′
iff, for all d₁, ..., dₙ ∈ D, h(d₁, ..., dₙ) ≤ h′(d₁, ..., dₙ). A model M′ (= ⟨D′, I′⟩)
is said to be at least as informative as M (M ≤ M′) iff D = D′, and I and I′ assign the
same interpretations to names and function symbols, and for all predicates G,
I(G) ≤ I′(G). Now the monotonicity property is this: if M ≤ M′ then the sentences
that are true (false) in M are also true (false) in M′. That is, as one moves to more
informative models, there is never a reversal of earlier classical values.

[Figure 5.1]
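
At the level of connectives, monotonicity can be checked by brute force. The sketch below encodes the information ordering on the three values and tests whether an operation preserves it; the operator `is_gap` is a hypothetical example, introduced here only to show what a failure of monotonicity looks like.

```python
from itertools import product

VALUES = ("t", "f", "n")

def leq(v, w):
    """Information ordering: n lies below t and f; t and f are incomparable."""
    return v == w or v == "n"

def neg(v):
    return {"t": "f", "f": "t", "n": "n"}[v]

def conj(v, w):
    if "f" in (v, w):
        return "f"
    return "t" if (v, w) == ("t", "t") else "n"

def is_gap(v):
    """Hypothetical connective: true of n, false of the classical values."""
    return "t" if v == "n" else "f"

def is_monotone(op, arity):
    """Check: whenever args <= args2 pointwise, op(args) <= op(args2)."""
    for args in product(VALUES, repeat=arity):
        for args2 in product(VALUES, repeat=arity):
            if all(leq(a, b) for a, b in zip(args, args2)):
                if not leq(op(*args), op(*args2)):
                    return False
    return True
```

Under these definitions `neg` and `conj` pass the check, whereas `is_gap` fails it: moving from n up to t reverses its verdict from t to f.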

Let ℒ (= ⟨L, M, κ⟩; M = ⟨D, I⟩) be a Strong Kleene language and T be a one-place
predicate not in L. (Our aim is to interpret T as ‘true’.) Let L⁺ be L extended with
T. Further, let the domain D of ℒ contain all the sentences of L⁺ (or the codes, e.g.
the Gödel numbers, of the sentences of L⁺). For h ∈ D → {t, f, n}, let M+h be the
model just like M except that T is assigned the interpretation h. Call M a ground
model, and M+h a standard model, of L⁺. Finally, let ℒₕ = ⟨L⁺, M+h, κ⟩ and say
that G is a T-predicate of ℒₕ iff the extension of G in ℒₕ contains all and only the
truths of ℒₕ and the antiextension of G contains all and only the falsehoods of ℒₕ.
(Adopt here the usual - and, in the present context, unimportant - assumption that
nonsentences fall among the falsehoods.) It is a remarkable property of the Strong
Kleene scheme that an interpretation h can invariably be found under which T is a
T-predicate of ℒₕ - i.e., one under which the interpretation of T coincides with that
of ‘true sentence of ℒₕ’.

A reformulation of this property will prove useful. Set H = D → {t, f, n}. For
U, V ⊆ D and U ∩ V = ∅, let ⟨U, V⟩ be the unique member h of H such that
U = {d ∈ D: h(d) = t} and V = {d ∈ D: h(d) = f}. Define the operation κ_M (‘the
Strong Kleene jump for M’) on H as

κ_M(h) = ⟨U′, V′⟩

where U′ is the set that contains the true sentences of ℒₕ, and V′ is the set that contains
the false sentences of ℒₕ as well as the nonsentences. Observe that h is a fixed point
of κ_M - i.e., κ_M(h) = h - iff T is a T-predicate of ℒₕ. The remarkable property of the
Strong Kleene scheme can now be stated thus: κ_M invariably has fixed points.

Two proofs of the existence of fixed points are sketched here: one is algebraic and
gives significant information about the structure of the fixed points of κ_M; the other
relies on an iterative construction and reveals much about one particularly important
fixed point.







The first proof rests on the properties of the structure ⟨H, ≤⟩ of the possible
interpretations of T under the information ordering. This structure is a particular
kind of poset (i.e., a partially ordered set: ≤ is reflexive, antisymmetric, and
transitive). To specify the kind of poset it is, first recall that

• y is an upper bound of a subset Z of a poset ⟨X, ≤⟩ iff y ∈ X and, for all z ∈ Z,
  z ≤ y;
• y is the least element of Z iff y ∈ Z and, for all z ∈ Z, y ≤ z; and
• y is a maximal element of Z iff y ∈ Z and there is no z ∈ Z such that z ≠ y and
  y ≤ z.


The notions of lower bound, greatest element, and minimal element receive dual
definitions (i.e., definitions obtained by replacing ‘≤’ by ‘≥’). Recall also that

• y is the supremum, ⋁Z, of Z iff y is an upper bound of Z and y is the least of the
  upper bounds of Z, and
• y is the infimum of Z iff y is a lower bound of Z and y is the greatest of the
  lower bounds of Z.


Let us say, following Visser (1989, p. 656), that

• Z is consistent iff every {u, v} ⊆ Z has an upper bound, and
• ⟨X, ≤⟩ is a coherent complete partial order (ccpo) iff every consistent subset of X
  has a supremum.


Examples: The values {t, f, n} under the informational ordering constitute a ccpo
(see figure 5.1). The dual of this structure - one obtained by flipping figure 5.1
about the horizontal - is not a ccpo. Finally, the structure ⟨H, ≤⟩ is a ccpo - indeed
it is a general fact that if ⟨X, ≤⟩ is a ccpo then the function space X^D under the
pointwise ordering induced by ≤ is also a ccpo. Here are some further properties of
ccpos to note.

• First, every ccpo has a least element - for ∅ is invariably consistent and so has a
  supremum, which is also the least element of the ccpo.
• Second, every nonempty subset Z has an infimum - for the lower bounds of Z
  are consistent and hence have a supremum, which is also the infimum of Z.
• Third, every ccpo has maximal elements. (This follows by Zorn’s lemma; see
  Visser (1989) or Gupta and Belnap (1993, section 2C).)
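
For finite posets the definitions of consistency and of a ccpo can be checked mechanically. The following Python sketch does so by brute force for the three values under the information ordering and for the dual ordering; the helper names are illustrative, and the approach only makes sense for small finite examples.

```python
from itertools import chain, combinations

VALUES = ("t", "f", "n")

def leq(v, w):
    """Information ordering: n lies below t and f."""
    return v == w or v == "n"

def subsets(xs):
    xs = list(xs)
    return chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1))

def upper_bounds(Z, X, order):
    return [y for y in X if all(order(z, y) for z in Z)]

def supremum(Z, X, order):
    """Least upper bound of Z in X, or None if there is none."""
    ubs = upper_bounds(Z, X, order)
    least = [y for y in ubs if all(order(y, u) for u in ubs)]
    return least[0] if least else None

def is_consistent(Z, X, order):
    """Every pair of members of Z has an upper bound in X."""
    return all(upper_bounds((u, v), X, order) for u in Z for v in Z)

def is_ccpo(X, order):
    """Every consistent subset of X has a supremum."""
    return all(supremum(Z, X, order) is not None
               for Z in subsets(X) if is_consistent(Z, X, order))

def dual(v, w):
    """The ordering obtained by flipping the information ordering."""
    return leq(w, v)
```

The empty set is consistent, and its supremum under `leq` is n, the least element. Under `dual` the empty set has no supremum, since t and f are both minimal and incomparable, which is why the flipped structure fails to be a ccpo.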


Observe that κ_M is a monotone operation on ⟨H, ≤⟩ in the sense that, for all
h, h′ ∈ H, if h ≤ h′ then κ_M(h) ≤ κ_M(h′). This follows immediately from the
monotonicity of the Strong Kleene scheme: If h ≤ h′, then M+h′ is at least as
informative as M+h. Hence the sentences true (false) in M+h must also be true
(false) in M+h′. Consequently, κ_M(h) ≤ κ_M(h′). Now the following theorem implies
the existence of the fixed points of κ_M.

Visser’s Fixed-Point Theorem (Visser, 1989, p. 659). Suppose that f: X → X
is a monotone operation on a ccpo ⟨X, ≤⟩. Let F (= {x ∈ X: f(x) = x}) be the set
of fixed points of f, and let ≤↾F be the relation ≤ restricted to F. Then ⟨F, ≤↾F⟩
is a ccpo.


A compressed proof. Let us say that x ∈ X is sound iff x ≤ f(x). Further, for Y ⊆ X,
let us identify the structure ⟨Y, ≤↾Y⟩ with the set Y. Now consider an arbitrary
U ⊆ F such that U is consistent in the poset F. We need to show that U has a
supremum in F. Observe that if a set Y ⊆ X is consistent, and if Y contains only
sound points, then the supremum ⋁Y of Y is sound. (Argument: For arbitrary
y ∈ Y, y ≤ ⋁Y. So, by the monotonicity of f, f(y) ≤ f(⋁Y). Since y is sound,
y ≤ f(y) ≤ f(⋁Y). So f(⋁Y) is an upper bound of Y. Hence ⋁Y ≤ f(⋁Y).) It
follows that if x is sound, then the set of sound points z ≥ x is a ccpo. Now U is a set
that contains only sound points. So ⋁U is sound and the set V = {x ∈ X: x ≤ f(x)
and ⋁U ≤ x} is a ccpo. Therefore V must have a maximal element v. We know that
⋁U ≤ v ≤ f(v). By monotonicity and the soundness of ⋁U, ⋁U ≤ f(⋁U) ≤ f(v)
≤ f(f(v)). So f(v) ∈ V. Since v is a maximal element of V, f(v) = v. This implies that
the set W = {x ∈ X: f(x) ≤ x and ⋁U ≤ x} is nonempty. Hence W must have an
infimum z in the ccpo X. This element z is a fixed point and is also the supremum
of U in the ccpo F. (Argument: For arbitrary w ∈ W, z ≤ w. By monotonicity,
f(z) ≤ f(w) ≤ w. So f(z) is a lower bound of W, and thus f(z) ≤ z. It is easily
verified that ⋁U ≤ f(z). So f(z) ∈ W. But z is a lower bound of W and f(z) ≤ z. So
f(z) = z. Every fixed point that is an upper bound of U belongs to W. So z must be
the supremum of U in F.) QED

The number of fixed points of κ_M can range from 1 to the cardinality of the
continuum - the number depends upon the sorts of vicious reference exemplified in
M. The theorem above shows that the fixed points of κ_M, irrespective of their
number, always constitute a ccpo. This tells us, among other things, that κ_M has
maximal fixed points, that every nonempty set of its fixed points has an infimum,
and that κ_M has a least fixed point.

The second proof of the existence of the fixed points of κ_M proceeds via Kripke’s
iterative construction of the least fixed point. (Incidentally, Martin and Woodruff
proved the existence of maximal fixed points for Kleene’s weak valuation scheme
[see chapter 14].) Let us see how the construction works by way of an example.
Suppose that the nonlogical symbols of L are the one-place predicates G and H, the
names a, b, and c, and the quotational names ‘A’ for all A ∈ S, the set of sentences
of L⁺. Fix M′ (= ⟨D′, I′⟩) to be the following classical model of L.

D′ = S ∪ {0}. I′ assigns to the quotational names their intended interpretation; it
assigns to the names a, b, and c the denotations 0, ~Tb, and Tc, respectively; and
it assigns to G and H the extensions {0} and {T‘T‘Ga’’}, respectively.


The iterative construction builds up an interpretation of T as follows. At the initial
stage, stage 0, the interpretation of T is set as ⟨∅, ∅⟩. This stage may be pictured
as representing our initial ignorance of the extension and the antiextension of T.
Now, despite this ignorance, the scheme κ can be used to determine the truths U₁
and the falsehoods V₁ of the model M′ + ⟨∅, ∅⟩. It is easy to see that Ga, ~Ha, and
~∀xHx, for example, belong to U₁ and the negations of these sentences belong
to V₁. Note that Ta and all other truth attributions are neither true nor false, for
both the extension and the antiextension of T are ∅. At the next stage, stage 1,
T is assigned the richer interpretation ⟨U₁, V₁⟩ [= κ_M′(⟨∅, ∅⟩)]. And the truths U₂
and the falsehoods V₂ of the model M′ + ⟨U₁, V₁⟩ are determined as before. By
monotonicity, U₁ ⊆ U₂ and V₁ ⊆ V₂. But some of the sentences that were earlier
assessed as neither true nor false now receive a classical value. For example, Ta and
T‘Ga’ were earlier assessed as neither true nor false, but now the former is assessed
as false and the latter as true. At stage 2, T is assigned the richer interpretation
⟨U₂, V₂⟩ [= κ_M′(⟨U₁, V₁⟩)]. The process can then be repeated to construct yet richer
interpretations of T at higher stages. The process can be continued into the transfinite
by taking the interpretation at a limit stage α to be the supremum -
⟨⋃_{β<α} U_β, ⋃_{β<α} V_β⟩ - of the earlier interpretations. (Since the earlier
interpretations constitute a consistent set, the supremum is bound to exist.) The
sentences of L⁺ constitute a set, so the iterative process must reach ‘closure’ at some
stage α: the interpretation of T at α must coincide with that at α + 1. This
interpretation is a fixed point - in fact, the least fixed point - of the Strong Kleene
jump. In the example above, the closure ordinal - the least ordinal at which closure
occurs - is ω, the first infinite ordinal. The closure ordinal depends on the type of
vicious reference in the ground model and can, in general, be much higher. Beginning
with the standard model of arithmetic (and with the standard Gödel numbering of
sentences), the fixed point is reached at ω₁^CK, the first nonrecursive ordinal.

Kripke (1984 [1975]) calls a sentence grounded in M iff it belongs to the extension
or to the antiextension of T in the least fixed point of κ_M. The sentences T‘Ga’
and ∀x(Hx → Tx) are examples of the grounded truths of M′, and T‘~T‘Ga’’ and
∀x(Tx → Hx) of the grounded falsehoods of M′. The sentences ~Tb (the Liar) and
Tc (the Truth Teller, which says of itself that it is true) are not grounded in M′. The
truth and falsity of grounded sentences can be traced to some “nonsemantic” facts in
the ground model. This tracing fails for the Liar and the Truth Teller; hence they
are ungrounded. The Liar and the Truth Teller are intuitively different - the Truth
Teller, unlike the Liar, can coherently be said to be true or to be false - and this
difference is reflected in the fixed points. Kripke defines a sentence to be paradoxical
iff it does not have a classical value in any fixed point. By this definition, the Liar
~Tb is paradoxical but the Truth Teller Tc is not. In fact, there is a fixed point in
which Tc is true and another in which it is false. Note that κ_M′ has exactly three fixed
points - in one Tc is n; in the second, f; and in the third, t - and these constitute a
ccpo isomorphic to the one pictured in figure 5.1.
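
The iterative construction and the classification of the Liar and the Truth Teller can be imitated on a finite toy language. The Python sketch below is a simplification under stated assumptions: it keeps only negation and truth attributions, tracks a hand-picked finite list of sentences instead of all of L⁺, and drops the quantifiers and the predicate H. Because the tracked set is finite, closure is reached at a finite stage here, unlike the closure ordinal ω of the example in the text.

```python
from itertools import product

def neg(v):
    return {"t": "f", "f": "t", "n": "n"}[v]

# Sentences are nested tuples; names denote via DENOTES.
GA = ("atom", "Ga")          # Ga, true in the ground model
T_GA = ("T", "qGa")          # T'Ga', where 'qGa' stands for the quotation name of Ga
TA = ("T", "a")              # Ta, where a denotes the nonsentence 0
TB = ("T", "b")              # Tb, where b denotes the Liar
LIAR = ("not", TB)           # ~Tb
TELLER = ("T", "c")          # Tc, where c denotes Tc itself

ATOMS = {"Ga": "t"}
DENOTES = {"a": 0, "b": LIAR, "c": TELLER, "qGa": GA}
SENTENCES = [GA, T_GA, TA, TB, LIAR, TELLER]   # finite fragment tracked here
NONSENTENCES = frozenset({0})

def value(s, h):
    """Strong Kleene value of s when T is interpreted by h = (U, V)."""
    U, V = h
    if s[0] == "atom":
        return ATOMS[s[1]]
    if s[0] == "T":
        d = DENOTES[s[1]]
        return "t" if d in U else "f" if d in V else "n"
    if s[0] == "not":
        return neg(value(s[1], h))
    raise ValueError(s)

def jump(h):
    """The jump: new extension = truths, new antiextension = falsehoods + nonsentences."""
    U = frozenset(s for s in SENTENCES if value(s, h) == "t")
    V = frozenset(s for s in SENTENCES if value(s, h) == "f") | NONSENTENCES
    return (U, V)

def least_fixed_point():
    """Iterate the jump from (empty, empty) until closure."""
    h, stage = (frozenset(), frozenset()), 0
    while jump(h) != h:
        h, stage = jump(h), stage + 1
    return h, stage

def all_fixed_points():
    """Brute-force search over all three-valued interpretations of T."""
    found = []
    for vals in product("tfn", repeat=len(SENTENCES)):
        U = frozenset(s for s, v in zip(SENTENCES, vals) if v == "t")
        V = frozenset(s for s, v in zip(SENTENCES, vals) if v == "f") | NONSENTENCES
        if jump((U, V)) == (U, V):
            found.append((U, V))
    return found
```

In this fragment the least fixed point puts Ga and T‘Ga’ in the extension of T and Ta in the antiextension, and leaves the Liar and the Truth Teller out of both, so they are ungrounded. The brute-force search finds three fixed points, differing only in the value given to the Truth Teller; the Liar is classical in none of them.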

The entire theory sketched above goes through for any three-valued scheme that
has the monotonicity property. (In fact, it goes through for any scheme, n-valued or
another, for which an analog of monotonicity holds.) One such scheme is the
supervaluation scheme of Bas van Fraassen (1966) [see also chapter 12]:








A sentence is true (false) in a three-valued model M by the supervaluation
scheme iff the sentence is true (false) in every classical model M₁ such that
M ≤ M₁.

Observe that Kripke’s definition of paradoxicality implies that the sentence
∀x~(Tx & ~Tx) is paradoxical. If the supervaluation scheme is used in place of the
Strong Kleene, then this sentence is assessed not as paradoxical but as grounded.
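
The contrast between the two schemes shows up already in a propositional analogue. The Python sketch below is an illustration under simplifying assumptions: a "model" is just an assignment of t, f, or n to atoms, and the classical models above it are obtained by filling each gap with t or f. It does not model the truth predicate or fixed points.

```python
from itertools import product

def neg(v):
    return {"t": "f", "f": "t", "n": "n"}[v]

def conj(v, w):
    if "f" in (v, w):
        return "f"
    return "t" if (v, w) == ("t", "t") else "n"

def sk_value(s, val):
    """Strong Kleene value of a sentence built from atoms, 'not', and 'and'."""
    if s[0] == "atom":
        return val[s[1]]
    if s[0] == "not":
        return neg(sk_value(s[1], val))
    if s[0] == "and":
        return conj(sk_value(s[1], val), sk_value(s[2], val))
    raise ValueError(s)

def supervalue(s, val):
    """True (false) iff true (false) in every classical completion of val."""
    gaps = [p for p, v in val.items() if v == "n"]
    results = set()
    for choice in product("tf", repeat=len(gaps)):
        classical = dict(val)
        classical.update(zip(gaps, choice))
        # On classical assignments Strong Kleene agrees with the classical scheme.
        results.add(sk_value(s, classical))
    if results == {"t"}:
        return "t"
    if results == {"f"}:
        return "f"
    return "n"

P = ("atom", "p")
NON_CONTRADICTION = ("not", ("and", P, ("not", P)))   # ~(p & ~p)
```

With p assigned n, Strong Kleene leaves `NON_CONTRADICTION` without a classical value, while the supervaluation scheme counts it true, since it holds however the gap is filled.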

It is important to be clear about the intuitive meaning of fixed points. In a
fixed-point language ℒₕ (= ⟨L⁺, M+h, κ⟩), T is a T-predicate. What this means is that if T
has the interpretation h then it is a T-predicate of ℒₕ: T is true (false) precisely of the
truths (falsehoods) of ℒₕ and so is coextensive with the concept ‘true sentence of
ℒₕ.’ This does not mean that if T had meant truth - that if it had expressed the
concept of truth - then its interpretation would have been h. (Sometimes a fixed
point can be the interpretation of T only if T does not express the concept of truth.)
Fixed points do not reveal the interpretation of truth absolutely. They reveal what
the interpretation of truth would be only under an antecedent hypothesis about the
interpretation of T. The fact that κ has fixed points does not establish therefore that
any of the fixed points are the proper interpretation of a predicate that expresses the
concept of truth (Gupta and Belnap, 1993, section 3A).

The Strong Kleene and the other schemes that allow for the existence of fixed points
are expressively incomplete. For example, consider the Łukasiewicz biconditional ≡,
which expresses the following function:

For all v, v′ ∈ {t, f, n}:
(v ≡ v′) = t if v = v′,
(v ≡ v′) = n if exactly one of v and v′ is n, and
(v ≡ v′) = f, otherwise.

[See chapter 14.] The Łukasiewicz biconditional is true, then, iff both its components
have the same value; it is false iff the components have different classical values; and
it is neither true nor false, otherwise. Any scheme ρ in which the Łukasiewicz
biconditional is expressible fails, in general, to allow for the existence of fixed points.
Consider a ground model M with the following features:

(i) there is a sentence A that is false in M;
(ii) there is a name b that denotes (A ≡ Tb) in M; and
(iii) there is a name c that denotes (Tb ≡ Tc) in M.

Suppose, for reductio, that there is a fixed point h for M and ρ. Then h(Tb) = n, for
Tb is Liar-like. Now if h(Tc) is classical then (Tb ≡ Tc) is neither true nor false and
hence contradicts our initial hypothesis that h is a fixed point. So h(Tc) = n. But this
implies that (Tb ≡ Tc) is true and therefore also contradicts the initial hypothesis.
The reductio is complete: there cannot be a fixed point for ρ in M.
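
The reductio can be confirmed by exhaustive search. The Python sketch below abstracts from everything in the model except the values of Tb and Tc: at a fixed point, the value of Tb must equal the value of the sentence that b denotes, and likewise for Tc. The function names are illustrative.

```python
from itertools import product

VALUES = ("t", "f", "n")

def luk_iff(v, w):
    """Lukasiewicz biconditional on the three values."""
    if v == w:
        return "t"
    if (v == "n") != (w == "n"):   # exactly one component is n
        return "n"
    return "f"

A = "f"   # feature (i): A is false in the ground model

def is_fixed(tb, tc):
    """tb, tc: candidate values of Tb and Tc.
    b denotes (A = Tb) and c denotes (Tb = Tc), with = the Lukasiewicz
    biconditional, so a fixed point must satisfy both equations below."""
    return tb == luk_iff(A, tb) and tc == luk_iff(tb, tc)

candidates_for_tb = [tb for tb in VALUES if tb == luk_iff(A, tb)]
fixed_points = [(tb, tc) for tb, tc in product(VALUES, repeat=2)
                if is_fixed(tb, tc)]
```

The only value that Tb can take is n, which is the sense in which Tb is Liar-like, and no value for Tc is then compatible with the second equation, so `fixed_points` is empty.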

In fixed-point languages, the semantic value of a sentence A is identical with the
semantic value of the truth attribution T‘A’. The argument above shows, however,
that the equivalence of A and T‘A’ cannot be expressed in the fixed-point languages,
for the relevant biconditional is inexpressible in them. Another example of
inexpressibility is provided by exclusion negation ¬:

¬t = f,   ¬n = t,   and   ¬f = t

The Liar is assessed as neither true nor false in the fixed-point languages. So it is
assessed to be untrue but, because of the absence of exclusion negation, this
assessment is not expressible within the languages themselves. The fixed-point
languages are weak, then, in their logical resources. They are also bound to be weak in
the semantic predicates they contain. A Strong Kleene fixed-point language, for
example, cannot in general contain its own ‘neither-true-nor-false’ predicate. The
situation with three-valued languages is similar, therefore, to the one that obtains for
classical languages. If a classical language is expressively weakened - for example, by
dispensing with negation - then it too can contain its own truth predicate. On the
other hand, an expressively rich three-valued language, like an expressively rich
classical language, cannot contain its own truth predicate.

The problem of gaining a semantical understanding of the concept of truth thus
remains. As was observed above, the concept is used - successfully for the most part
- in expressively rich languages. But for such languages fixed points do not exist.
How then is one to make semantic sense of the concept of truth?


5.3. Revision Theories

Revision theories make two principal claims:

(i) Logical and semantic sense can be made of circular definitions and concepts.
(ii) Truth is a circular concept.





These claims are motivated by a striking parallel that obtains between the behavior
of the concept of truth and the behavior of concepts with circular definitions. The
concept of truth is plainly unproblematic over a large range of sentences. It applies
unproblematically to

Snow is white,

and

All theorems of Peano Arithmetic are true,

and numerous other similar sentences, and it fails to apply to their negations. But
there are sentences - for example, the Liar and the Truth Teller - over which the
concept behaves in a problematic and perplexing way. Now compare this behavior
of the concept of truth with the behavior of a predicate G that is given a circular
definition:

Gx =Df [x = Socrates] ∨ [x = Plato & Gx] ∨ [x = Aristotle & ~Gx]    (5.2)


Notice that definition (5.2), though circular, renders the definiendum G unproblematic
over a large range of objects. In fact, the definition renders G unproblematic over all
objects except Plato and Aristotle: it implies that Socrates is G and the rest are not
G. Over Plato and Aristotle, G behaves in a problematic and perplexing way. But
observe that the pathology G exhibits parallels exactly the one that truth exhibits
over the Truth Teller and the Liar. First, all attempts to determine whether Plato
and Aristotle are G end up cycling in a loop, mimicking what happens with the
Truth Teller and the Liar. Second, just as an arbitrary stipulation about the truth of
the Truth Teller can be sustained, similarly one can sustain an arbitrary stipulation
about the G-ness of Plato. One can coherently assert that Plato is G and also that
Plato is not G. Finally, an uncritical use of the rules governing definitions
(Definiendum Introduction and Definiendum Elimination) allows one to go back and forth
between the claim that Aristotle is G and the claim that Aristotle is not G - just as
an uncritical use of the rules governing truth (Truth Introduction and Truth
Elimination) allows one to go back and forth between the claim that the Liar is true and
the claim that the Liar is not true.

More generally, several different kinds of pathologies are exhibited by the concept 
of truth and these, it turns out, coincide with the pathologies exhibited by concepts
with circular definitions. This suggests, contrary to traditional ideas, that semantic 
sense can perhaps be made of circular concepts and definitions, and that the perplex- 
ing behavior of the concept of truth may be rooted in a circularity in the concept. 

How can one make semantic sense of predicates with circular definitions? A
definition - say, Gx =Df A(x) - may be viewed as providing a rule for determining the
extension of its definiendum (i.e., G) in terms of its definiens (i.e., A): to determine
whether an object d falls in the extension of G one determines whether d satisfies
the definiens A(x). The problem is that, with circular definitions, this procedure
breaks down. For, in circular definitions, the definiendum occurs in the definiens.
Consequently, information is needed about the extension of G so as to determine
whether the object satisfies the definiens. Observe though, that despite this
difficulty, the procedure is not entirely useless. The procedure can be applied provided
one supplies it with a hypothesis about the extension of G. For example, hypothesize
the extension of G in definition (5.2) to be ∅. It is now easily verified that Socrates
and Aristotle satisfy the definiens. Hence, under the stated hypothesis, the
procedure yields {Socrates, Aristotle} as the extension of G.
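
The hypothetical use of definition (5.2) can be sketched in Python as a function from a hypothesized extension of G to the extension the definiens then yields. The finite domain, including the extra individual Xenophon, is a hypothetical choice made for the illustration.

```python
# Hypothetical finite domain; Xenophon stands in for "the rest".
DOMAIN = frozenset({"Socrates", "Plato", "Aristotle", "Xenophon"})

def definiens(x, G):
    """The definiens of (5.2), evaluated with G as the hypothesized extension."""
    return (x == "Socrates"
            or (x == "Plato" and x in G)
            or (x == "Aristotle" and x not in G))

def revise(G):
    """Extension of the definiendum relative to the hypothesis G."""
    return frozenset(x for x in DOMAIN if definiens(x, G))
```

Applied to the empty hypothesis, `revise` returns the set containing Socrates and Aristotle, as in the text. Applying it once more drops Aristotle again, which is the cycling behavior described earlier.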

A circular definition does not provide a categorical rule for determining the
extension of its definiendum. It provides instead a hypothetical rule, a rule that
yields an extension of the definiendum relative to a hypothesis about the extension.
The central idea of revision theories is that this rule should be viewed as a rule of
revision: an application of the rule yields a hypothesis that is a better candidate - or,
at least, it is as good a candidate - for the extension of the definiendum as the initial
hypothesis.

The revision rule has a hypothetical character, but it provides a basis for
categorical judgments. The intuitive idea for making the transition to the categorical is this.
One considers all possible hypotheses for the extension of the definiendum, and one
tries to improve them through repeated applications of the revision rule. Hypotheses
that survive this process - those that are found to occur over and over again in the
course of revision - are the ones that are deemed best by the revision rule. If a claim
holds under all the best hypotheses, then it is true categorically. If a claim fails under
all these hypotheses, then it is false categorically. And if neither of these alternatives
holds - that is, if the revision rule yields no categorical verdict - then the claim is
pathological.

This intuitive idea can be made precise as follows. Let M (= ⟨D, I⟩) be a classical
model for L and let L be extended to L⁺ by the addition of a new one-place
predicate G. Further, let G be governed by the (possibly) circular definition 𝒟,

(𝒟) Gx =Df A(x, G)

where A(x, G) is a formula that contains no free occurrences of variables other than
x. Let us call members of the set D → {t, f} hypotheses. And, for h a hypothesis, let
M+h be the model of L⁺ that is just like M except that it assigns to G the
interpretation h. The rule of revision δ_{𝒟,M} for 𝒟 in M is an operation on the set
D → {t, f} that satisfies the condition

δ_{𝒟,M}(h)(d) = t iff d satisfies A(x, G) in M+h.


Note that, in general, the revision rule δ_{𝒟,M} is not monotone in any interesting sense.
It cannot be used to iteratively build up an interpretation of G. (In fact, δ_{𝒟,M} may
lack fixed points altogether.) To extract categorical information from the revision
rule, the notion of ‘revision sequence’ is defined. Suppose S is a sequence of
hypotheses and suppose that the length, lh(S), of S is either a limit ordinal or the class
On of all ordinals. For α < lh(S), let S_α be the αth member of S and let S↾α be
S restricted to α.

• An object d is stably t (f) in S iff there is an ordinal α < lh(S) such that for all
  ordinals β, if α ≤ β < lh(S) then S_β(d) = t (f).
• S is a revision sequence for δ_{𝒟,M} iff, for all α < lh(S), S satisfies two conditions:
  (i) if α = β + 1 then S_α = δ_{𝒟,M}(S_β);
  (ii) if α is a limit ordinal then, for all d ∈ D, if d is stably t (f) in S↾α then
  S_α(d) = t (f).





In a revision sequence S, then, one begins with an arbitrary initial hypothesis S₀. At
stage 1 - and at successor stages generally - one revises the hypothesis at the
previous stage by an application of the revision rule. At a limit stage α, the results of
earlier revision are summed up:

• If an object is stably t (f) up to α then it is declared t (f) at α.
• Otherwise, it is arbitrarily assigned one of the truth values.

(This treatment of limit stages is due to Belnap.)

It can now be seen how the system S* of Gupta and Belnap (1993) makes
semantic sense of circular definitions. A hypothesis h is α-reflexive for δ_{𝒟,M} iff there is
a revision sequence S for δ_{𝒟,M} such that α < lh(S) and S₀ = S_α = h; and h is reflexive for
δ_{𝒟,M} iff h is α-reflexive for some ordinal α > 0. It is easy to show that reflexive
hypotheses are precisely the ones that occur over and over again in On-long revision
sequences - they are the ones that survive the revision process. Now a sentence A is
valid on 𝒟 in M in the system S* iff, for all hypotheses h that are reflexive for δ_{𝒟,M},
A is true in M+h; and 𝒟 implies A in S* iff for all classical models M of L, A is
valid on 𝒟 in M in S*. An argument A₁, ..., Aₙ ∴ B is valid on 𝒟 in S* iff 𝒟
implies ((A₁ & ... & Aₙ) → B) in S*.

Example. Consider definition (5.2) and suppose M represents the actual state of
affairs - so that Socrates, Plato, and Aristotle are distinct individuals. Then the revision
rule has four reflexive hypotheses: {Socrates}, {Socrates, Plato}, {Socrates, Aristotle},
and {Socrates, Plato, Aristotle}. Of the following sentences,

G(Plato)
G(Aristotle)
G(Socrates)
∀x(Gx → x = Socrates ∨ x = Plato ∨ x = Aristotle)

the last two are valid - they are true under all reflexive hypotheses - but the first two
are not. They are pathological: true under some reflexive hypotheses and false under
others. Different kinds of pathologicality can be distinguished; the details are omitted
here, so refer to Gupta and Belnap (1993, section 5D.15).
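
For a finite domain the reflexive hypotheses of this example can be found by brute force. The Python sketch below makes a simplifying assumption that should be kept in mind: it looks only at finite iterations of the revision rule and ignores limit stages, which for definition (5.2) gives the same four hypotheses that the text lists. The domain, with the extra individual Xenophon, is hypothetical.

```python
from itertools import chain, combinations

DOMAIN = ("Socrates", "Plato", "Aristotle", "Xenophon")   # hypothetical domain

def revise(G):
    """Revision rule for definition (5.2)."""
    return frozenset(
        x for x in DOMAIN
        if x == "Socrates"
        or (x == "Plato" and x in G)
        or (x == "Aristotle" and x not in G)
    )

def all_hypotheses():
    """All candidate extensions of G, i.e. all subsets of the domain."""
    subsets = chain.from_iterable(
        combinations(DOMAIN, r) for r in range(len(DOMAIN) + 1))
    return [frozenset(s) for s in subsets]

def is_reflexive(h):
    """h recurs after some positive finite number of revisions
    (limit stages are not modeled in this sketch)."""
    g = revise(h)
    for _ in range(2 ** len(DOMAIN)):
        if g == h:
            return True
        g = revise(g)
    return False

REFLEXIVE = {h for h in all_hypotheses() if is_reflexive(h)}

def verdict(sentence):
    """sentence: a function from a hypothesis to True/False."""
    results = {sentence(h) for h in REFLEXIVE}
    if results == {True}:
        return "valid"
    if results == {False}:
        return "categorically false"
    return "pathological"

PHILOSOPHERS = {"Socrates", "Plato", "Aristotle"}
```

Here `verdict(lambda h: "Socrates" in h)` and `verdict(lambda h: h <= PHILOSOPHERS)` return "valid", while the corresponding checks for Plato and Aristotle return "pathological".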

The traditional theory imposes two requirements on definitions: noncreativity and
eliminability. Noncreativity requires that the addition of a definition should not
create essentially new validities. Under the S* semantics, all definitions, ordinary as
well as circular, meet this requirement: if a sentence A of L is deemed valid in M by
the S* semantics then A must be true in M. Eliminability, on the other hand,
requires that occurrences of the definiendum must be eliminable: for every sentence
A, the definition should imply the equivalence of A with a sentence B that contains
no occurrences of the definiendum. The S* semantics does not meet this
requirement. Indeed the requirement cannot be met by any theory that aims to
accommodate all circular definitions. One important intuition underlying eliminability is,
however, respected in S*: a definition fixes completely the meaning of its definiendum.
Note also that S* respects eliminability for noncircular definitions. More generally,
with noncircular definitions, S* leaves intact our ordinary preconceptions about
definitions and also our ordinary ways of working with them. Only on the domain of
the circular does S* dictate some modification of our ideas and practices.

A remark about the expressive power of circular definitions under S*: A construction
due to Kremer (1993) establishes that inductive and co-inductive sets are
definable using first-order circular definitions. Expressive power goes hand in hand
with complexity. Kremer (1993) and Antonelli (1994) have shown that there are
finite sets of definitions whose implications in S* constitute a Π¹₂ set.

Tarski wrote of the T-biconditional that it “may be considered a partial definition
of truth, which explains wherein the truth of ... one individual sentence consists”
(1944, section 4). Notice that Tarski’s suggestion can be accepted only if one is
prepared to countenance circular partial definitions. For the definiens supplied by
some T-biconditionals contains the term “true.” Example:


‘Everything Jones says is true’ is true =Df everything Jones says is true.


With the theory of definitions S* one can now implement Tarski's suggestion.
Suppose, as before, that the one-place predicate T is added to ℒ (= ⟨L, M, τ⟩,
M = ⟨D, I⟩), that the sentences of ℒ are in the domain D, and that ℒ has quotational
names of all the sentences of L. Suppose further that T is governed by the Tarski
biconditionals - T‘A’ =Df A - and the principle that only sentences are true. Suppose,
that is, that T is governed by the following infinitistic circular definition:¹⁸


Tx =Df (x = ‘A₁’ & A₁) ∨ (x = ‘A₂’ & A₂) ∨ ... ∨ (x = ‘Aₙ’ & Aₙ) ∨ ...


This definition yields a rule of revision τ_M for T in M. The rule can be characterized
thus: for all hypotheses h and all d ∈ D,


τ_M(h)(d) = t iff d is a sentence that is true in M+h.


Example: Consider the model M′ from the previous section. Let h be the hypothesis
under which the extension of T is ∅ - so, h(d) = f, for all d ∈ D. The first three
applications of the revision rule τ_M′ to h yield three interpretations.


(i) {Ga, ~T‘Ga’, ~T‘~Ga’, ∀x~Tx, ~Tc, ~Tb, ...} = τ_M′(h)
(ii) {Ga, T‘Ga’, T‘~T‘Ga’’, ∃xTx, ~Tc, Tb, ...} = τ_M′(τ_M′(h))
(iii) {Ga, T‘Ga’, T‘T‘Ga’’, ~T‘~T‘Ga’’’, ~Tc, ~Tb, ...} = τ_M′(τ_M′(τ_M′(h)))


Note that, in the course of revision, sentences move both in and out of the extension
of T.

The revision rule τ_M has the following remarkable property. In a large class of
models - models that can roughly be characterized as those without vicious reference -
repeated applications of the revision rule to a hypothesis culminate in a fixed
point, and furthermore, no matter where one begins the process of revision, it leads
to the same fixed point. In these models, the S* semantics dictates that truth has a
classical interpretation and that the usual preconceptions about it hold.¹⁹

The presence of vicious reference destroys the ideal of convergence to a unique
fixed point. A Truth Teller destroys convergence: it makes the outcome of revision
dependent on the initial hypothesis. In the example above the Truth Teller Tc stays
out of the extension of T at successive stages of revision, reflecting the initial
hypothesis that it is untrue. If the initial hypothesis were that it is true, then the
Truth Teller would have stayed in the extension of T at all successive stages. A Liar
sentence, on the other hand, destroys the stability of revision. Note that the Liar
~Tb flips in and out throughout the revision process. Consequently, the process
does not settle down; at every stage of revision, the outcome is revised to a different
one.

Even in the presence of vicious reference, many sentences receive a categorical
assessment in the revision process. These sentences eventually settle either in the
extension or in the antiextension of truth. Furthermore, they settle the same way
irrespective of the initial hypothesis of revision. Note that the grounded truths of the
Strong Kleene scheme, and of the supervaluation scheme, are valid - they invariably
settle in the extension. (There are valid sentences, however, that are not grounded
on either scheme.) The theorems of classical logic are also valid. Furthermore, the
material biconditionals T‘A’ ↔ A are valid for nonpathological sentences A, but
they may fail to be valid for pathological sentences.
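
The contrast between stable, unstable, and hypothesis-dependent sentences can be
imitated on a small scale. The Python sketch below is a toy illustration, not the
chapter's construction: it restricts attention to six sentences and to finitely many
stages (so limit stages never arise), and the tuple encoding, the names "liar" and
"teller", and the helper functions are all hypothetical.

```python
FACTS = {"Ga"}           # the single atomic fact of the toy model

GA = ("atom", "Ga")


def T(term):
    """The sentence saying that the sentence named by `term` is true."""
    return ("T", term)


def Not(sentence):
    return ("not", sentence)


# Self-referential sentences: the name "liar" denotes ~T(liar), "teller" denotes T(teller).
LIAR = Not(T("liar"))
TELLER = T("teller")
DENOTES = {"liar": LIAR, "teller": TELLER}

# A finite stand-in for the (really infinite) stock of sentences.
SENTENCES = [GA, T(GA), Not(T(GA)), T(T(GA)), LIAR, TELLER]


def referent(term):
    """A string is a proper name; any other term is a sentence quoting itself."""
    return DENOTES[term] if isinstance(term, str) else term


def true_in(sentence, h):
    """Classical truth, with the hypothesis h serving as the extension of T."""
    kind = sentence[0]
    if kind == "atom":
        return sentence[1] in FACTS
    if kind == "not":
        return not true_in(sentence[1], h)
    if kind == "T":
        return referent(sentence[1]) in h
    raise ValueError(kind)


def revise(h):
    """Revision rule: the new extension is the set of sentences true under h."""
    return frozenset(s for s in SENTENCES if true_in(s, h))


def stages(h, n):
    """The hypothesis h followed by its first n revisions."""
    out = [frozenset(h)]
    for _ in range(n):
        out.append(revise(out[-1]))
    return out


from_empty = stages(frozenset(), 6)
from_teller = stages(frozenset({TELLER}), 6)

print("Liar, from the empty hypothesis:  ", [LIAR in s for s in from_empty])
print("Teller, from the empty hypothesis:", [TELLER in s for s in from_empty])
print("Teller, hypothesised true:        ", [TELLER in s for s in from_teller])
```

In this toy setting Ga and T‘Ga’ settle in the extension whatever the starting point,
the Truth Teller keeps whatever status the initial hypothesis gives it, and the Liar
alternates at every stage.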

There is thus a fundamental distinction between the definitional equivalences
T‘A’ =Df A and the material equivalences T‘A’ ↔ A. The former are acceptable, but
the latter are not. The distinction provides a natural diagnosis of the fallacy in the
Liar argument: equivocation. When the T-biconditionals are read definitionally, they
are acceptable but they fail to imply any contradictions. (No definition implies
contradictions in S*.) When they are read materially, the T-biconditionals imply
contradictions but they are unacceptable.

The foregoing gives a sketch of some of the shared features of revision theories.
Now, turning to differences: The main difference centers on how revision theories
extract categorical information from the revision rule and, in particular, how they
treat limit stages. Belnap’s limit rule, on which the theory S* is based, is the simplest
and the most liberal. It allows arbitrary choice with respect to unstable elements.
The other limit rules impose constraints on how choices are made. Herzberger's
(1984 [1982]) limit rule is the strictest: it requires the unstable elements to be
removed from the extension. Yaqūb (1993) has argued against Belnap’s and
Herzberger’s rules (and also against another that I had suggested) and has proposed
an intricate one of his own. Chapuis (1996) has studied a treatment of limit stages
that results in “fully-varied” revision sequences. This treatment seems to me to be
especially promising.

The choices made at limit stages can contaminate the revision process, though the
effects of contamination are reduced by subsequent applications of the revision rule.
It is natural, therefore, to ignore a finite number of stages near limits when assessing
stability. If this is done, one obtains the system S# of Gupta and Belnap (1993); for
a more precise account, see Gupta and Belnap (1993, section 5D). It is important to
draw attention to one substantial difference between S* and S#. The
theory of truth based on S* does not imply the semantical laws - laws such as,


‘~A’ is true iff A is not true,
A conjunction is true iff its conjuncts are true,

and

If every object has a name, then a universal quantification is true iff all its instances
are true.


The semantic law of negation, for instance, fails to be valid in S* because it is false
at all limit stages at which both the Liar and its negation are declared to have the
value t. The theory of truth based on S#, on the other hand, does validate the
semantic laws. For the failure of the laws occurs only at limit stages, which can be
neglected in S# in assessments of validity. This advantage of S# is intimately connected,
however, to a feature that some critics regard as a grave flaw: on the S# theory some
definitions imply ω-inconsistencies. (S# respects the requirement of noncreativity, so
no definition implies outright contradictions. However, there are definitions that,
for each object d, validate the sentence ~Ad̄ - d̄ a name of d - and validate also the
sentence ∃xAx.) McGee (1991, pp. 29-31) has shown that, under minimal conditions,
the semantic laws imply ω-inconsistencies. So one cannot gain the semantic
laws without countenancing ω-inconsistent definitions. I myself am attracted to the
system S# and am willing to entertain seriously the idea that there are ω-inconsistent
definitions and that the Tarskian definition of truth is one such.²⁰


5.4. Contextual Theories 


According to contextual theories, the interpretation of ‘true’ depends on context.
The precise account of the dependence varies considerably from theory to theory,
however. Parsons (1984 [1974]) was the first in the contemporary debate to argue
for contextual theories. He suggested that the relevant contextual contribution is a
“scheme of interpretation,” a scheme that specifies the ranges of quantifiers, the
interpretation of indirect-discourse terms such as ‘say’, etc. Burge (1984 [1979]) defined
hierarchies of truth predicates. The contribution of context in his theory is the level
at which ‘true’ is interpreted. Barwise and Etchemendy (1987) proposed what they
called an Austinian theory in which the relevant contextual contribution is the
portion of the world that a proposition is about. (They also presented an interesting
modeling of circular propositions within Aczel’s non-well-founded set theory.)
The theories of Koons (1992) and Simmons (1993) are closely related to those of
Parsons and Burge. Relevant contextual factors in their theories include intentions
of speakers; these help to determine something like a scheme of interpretation. A
distinguishing mark of Simmons's theory is its avoidance of hierarchies. The theories
of Skyrms (1984) and Gaifman (1992) can also be seen as falling in the contextual
category. These authors look at the network generated by the process of semantic
evaluation, and the interpretation of ‘true’ in an utterance depends on the place of
the utterance in this network. (See the discussion of the Chrysippus intuition below
for a simple example.)

Three ideas - Universality, the Strengthened Liar, and the Chrysippus intuition
(though these are not always clearly distinguished) - have played a dominant role in
the motivation of the modern contextual theories. These theories are not motivated,
it should be stressed, by the ordinary unproblematic uses of ‘true’. Before the work
of Kripke and others, there was the idea that contextual elements are also needed for
the interpretation of ‘true’ in unproblematic sentences. For example, if ‘true’ in

Nothing X said at t is true.

is interpreted as ‘trueₘ’ in a Tarskian hierarchy, then the level m has to depend on
such contextual factors as what X actually said at t. If X did not use the truth
predicate (and other similar devices), then m can be 1; otherwise, m has to be higher.
But the modern theories avoid contextual shifts in the interpretation of unproblematic
sentences. (They do so by piggybacking on a type-free theory, e.g., a fixed-point
theory.) Context comes into play in the modern theories only to account for certain
ideas and intuitions connected with the paradoxes. The next subsections examine
the more important of these ideas and intuitions.


5.4.1. Universality


Tarski expressed the view that the source of the paradoxes is the universality of
natural languages. He wrote (Tarski, 1969, p. 67):


We have to analyze those features of the common language that are the real source of
the antinomy of the liar. When carrying through this analysis, we notice at once an
outstanding feature of this language - its all-comprehensive, universal character. The
common language ... is supposed to provide adequate facilities for expressing
everything that can be expressed at all, in any language whatsoever.


Several authors have accepted Tarski's diagnosis that the real source of the paradoxes
is the universality of natural languages,²¹ and they have treated universality as a
desideratum for any adequate theory of truth. In practice, the desideratum is
weakened to the following: the theory must show how semantic self-sufficiency is
possible. That is, the theory must show how there can be languages ℒ such that a
complete semantic theory for ℒ is expressible in ℒ itself. This desideratum plays
a large role in, for example, Simmons’s work. Simmons (1993) uses it to argue
against fixed-point and revision theories, and he aims to satisfy it through his own
contextual theory.

The ideas of universality and semantic self-sufficiency need philosophical clarification
and justification before they can be accepted as providing desiderata for a
theory of truth. Ordinary English - to take the most immediate example of a natural
language - is plainly not universal: it does not have the resources to express everything.
It is true that English is highly flexible - its expressive power can be extended
indefinitely (e.g., through the addition of new vocabulary). And it may be that, for
everything that can be expressed in any language, there is an extension of English
that expresses it.²² But this provides no reason to believe that English has extensions
that are semantically self-sufficient. The large literature on the subject has yet to
provide a good argument for the claim that semantically self-sufficient languages are
possible.²³

There is a related, but much weaker, thesis that is a plausible desideratum for
theories of truth: a natural language can express - either as it is or through an
extension - any semantic concept whatsoever. This thesis (call it ‘weak universality’)
implies that the status of pathological sentences such as the Liar can be expressed in
natural languages (or their extensions). So, if a theory declares the Liar to be neither
true nor false (or not categorical or not true in a context c), then weak universality
requires that the theory show how a language can express not only its own concept
of truth but also its own concept of neither-true-nor-false (or ‘not categorical’ or
‘not true in a context c’).


5.4.2. The strengthened Liar


An example of the strengthened Liar is the following statement SL.

(SL) Either SL is neither-true-nor-false or it is not true.


If SL is either true or false, then SL reduces to the Liar and is paradoxical. Further,
SL cannot be assessed to be neither true nor false, for that implies that SL is true
(because the first disjunct of SL is true). SL thus raises a serious problem for any
theory that assesses the paradoxes to be neither true nor false.

The strengthened Liar plays a significant role in Parsons’s (1984 [1974]) and
Burge’s (1984 [1979]) contextual theories. They rely on it in their objections
against the truth-value-gap theories; and they use it to lend plausibility to the idea
that the Liar statement (and also SL) is in one sense untrue. This point is closely
connected with the Chrysippus intuition and is discussed below.

The strengthened Liar is a problem for any theory that attempts to meet weak
universality. Let a theory declare the Liar to be K (where K is an arbitrary semantic
concept). A new strengthened version of the Liar can be formed by mimicking the
earlier one:


(KL) ‘This very statement is K or else it is not true’.


As before, one cannot assess KL as K, for this implies that KL is true; furthermore,
one cannot assess KL as not K, for then KL reduces to the Liar and is paradoxical.
So the semantic character of KL lies beyond the reach of the concept K. More
generally, the addition of semantic concepts to a language can result in new paradoxes
that cannot be adequately described using those very semantic concepts. To
meet weak universality, a theory of truth must give an account of the paradoxes that
arise not only from the concepts that are the object of investigation but also those
that arise from the concepts that the theory invokes; see Gupta and Belnap (1993,
section 6E and pp. 253-9).


5.4.3. The Chrysippus intuition


Suppose Zeno says at time t,

(Z) What Zeno says at t is not true.

Chrysippus overhears Zeno, reflects on his remark, and finds that it has “no meaning
at all” and therefore rejects it (Bocheński, 1961, p. 133). Chrysippus reports his
assessment by saying

(C) What Zeno says at t is not true.


Zeno and Chrysippus use the same sentence in making their claims. But, according
to the Chrysippus intuition, Zeno’s statement is to be assessed as pathological and
Chrysippus’s statement is to be assessed as true. This intuition is shared, as far as I
know, by all advocates of contextual theories. If the intuition is correct, then the
argument for the contextual theories is straightforward: Zeno and Chrysippus use
the same sentence to make different statements. Context must therefore make a
contribution to what they say, and this contribution can plausibly be located at only
one place, namely, the interpretation of ‘true’.²⁴

According to Skyrms (1984) and Gaifman (1992), the important difference between
the statements of Zeno and Chrysippus is their location in the network of
semantic evaluation. To evaluate Zeno’s statement Z, one has to find the denotation
of ‘what Zeno says at t’ and evaluate it for truth. But the denotation is Z itself. So
one is forced to repeat the process, and is caught in an unending loop. Evaluating
Chrysippus’s statement C leads to evaluating not C but Z, which then results in a
loop. There is an important difference between Z and C, according to Skyrms and
Gaifman: Z is caught up in a pathological loop, but C stands outside this loop and
can therefore semantically assess Z. (See figure 5.2.) In general, complex networks
of semantic evaluation may contain pathological loops with many interdependent
statements. Statements that stand outside these loops can give, according to Skyrms
and Gaifman, true semantic assessments of one or more members in the loop. See
Gaifman (1992) for a particularly clear and elegant development of the idea.
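
The distinction between lying in a loop and standing outside it is a simple
graph-theoretic one. The Python sketch below is only an illustration of that
distinction, not a rendering of Skyrms's or Gaifman's formal systems; the dictionary
EVALUATES and the function names are hypothetical. An edge from one statement to
another means that evaluating the first requires evaluating the second.

```python
# Evaluation network for the Zeno/Chrysippus case: Z calls for the evaluation
# of Z itself, while C calls for the evaluation of Z.
EVALUATES = {"Z": ["Z"], "C": ["Z"]}


def reachable(start, graph):
    """All statements whose evaluation is eventually called for by `start`."""
    seen, stack = set(), list(graph.get(start, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, []))
    return seen


def lies_in_loop(node, graph):
    """A statement lies in a loop if its evaluation leads back to itself."""
    return node in reachable(node, graph)


def classify(node, graph):
    if lies_in_loop(node, graph):
        return "in a loop"
    if any(lies_in_loop(other, graph) for other in reachable(node, graph)):
        return "outside a loop, assessing a member of it"
    return "grounded"


for statement in ("Z", "C"):
    print(statement, "->", classify(statement, EVALUATES))
```

On this toy network Z is classified as lying in a loop and C as standing outside it,
which is the structural difference the contextual reading exploits.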


[Figure 5.2: the network of semantic evaluation for Z and C]


The contextual theories are primarily motivated, as noted above, by ideas and
intuitions about the paradoxes. Nevertheless, they have important consequences for
ordinary practices. Suppose that a trusted friend and a political insider informs you:


Everything Senator X will say tonight is untrue.


You may have every good reason to believe him, but on the contextual theories
your reiteration of the friend’s claim may be logically illegitimate: the interpretation
of ‘true’ in your reiteration might be different from that in your friend’s utterance.
For example, it is possible that your reiteration lies in a pathological loop, whereas
your friend’s utterance lies outside this loop. So, it is possible that your friend's
utterance is true but your reiteration is nonetheless pathological and hence untrue.

The Chrysippus intuition is often closely associated with the thesis of weak universality,
but the two are quite different. The Chrysippus intuition embodies a descriptive
claim about the uses of ‘true’ - namely, that the pathological sentences can correctly
be described as ‘untrue’. The thesis of weak universality, on the other hand, makes
no particular claims about the proper vocabulary for the assessment of pathological
sentences. This thesis is immune, therefore, from an entirely natural doubt to which
the Chrysippus intuition is subject: if our final assessment is that the Liar is untrue,
then should that not lead one to say that the Liar is true after all (because the Liar
says of itself that it is not true)?

The Chrysippus intuition is a theoretical intuition. It arises from theoretical reflection
on the paradoxes and is tied to one specific type of theoretical response. If the
response is, for example, that the Liar statement is meaningless or that it is neither
true nor false, then one is led to say that the Liar statement is untrue. Weak universality
now dictates that this assessment should be expressible in natural languages,
and one arrives at the Chrysippus intuition. The theoretical response is not forced,
however. An alternative response is to judge the Liar to be pathological and to judge
the truth-attributions to the Liar - that the Liar is true and that the Liar is neither
true nor false - to be pathological also. Weak universality now requires the theory to
show how the assessment that the Liar is pathological can itself be expressed in a
natural language. This is a significant demand, but it is different from the one
imposed by the Chrysippus intuition.


Suggested further reading


Tarski's essay (1944) and the essays in Martin (1984) are a good place to begin one’s study
of paradox. After that the reader should turn to Gupta and Belnap (1993), McGee (1991),
and Barwise and Etchemendy (1987).


Notes


1 See Bocheński (1961) and Spade (1988). The paradox was also discussed in Indian logic
and semantics, notably by Bhartṛhari (seventh century CE); see Houben (1998).

Note: A more detailed treatment of many topics discussed in this essay can be found in
Belnap's and my book The Revision Theory of Truth (1993).

2 Bar-Hillel calls propositions ‘statements.’ See the bibliography of Gupta and Belnap
(1993) for references to Bar-Hillel’s work. The reader should consult this bibliography
for references to works alluded to below that, for reasons of space, are not included in
the list of References.

3 I will often suppress the relativity of truth to language.

4 Priest (1987) tries to deal with the problem by weakening classical logic.

5 Historical note: There was substantial work on three-valued approaches to the Liar in the
late 1960s and the early 1970s. But much of this work was concerned with questions
different from the one Martin, Woodruff, and Kripke settled. One concern in this work was
to provide a rationale for the idea that paradoxical statements suffer from truth-value gaps.
Another concern was with certain simple but powerful objections (e.g., the strengthened
Liar) to the truth-value gap idea. Three-valued approaches were also defended in the
medieval literature on the Liar. And Bochvar's work on the subject dates from 1937.
The true anticipations of Martin, Woodruff, and Kripke occurred in partial set theory,
in the work of Fitch and Gilmore. Gilmore proves and applies a fixed-point theorem to
a problem that parallels the Liar; see Feferman (1984, section 14), for more historical
information. Kripke’s essay stands out, however, for its philosophical clarity and force.

6 Chapuis and Gupta (1999) contains some recent essays on truth theories and their
applications.

7 Here and below, the symbol ‘≤’ is used to designate several different orderings. The
context will always make the intended reference of ‘≤’ clear.

8 I shall often omit the parenthetical clause.

9 Recall that, by our earlier definition, ⟨∅, ∅⟩ is the function that assigns n to each
member of the domain.

10 For more information about the fixed points of κ_M, see Burgess (1986) and Visser
(1989). As Kripke briefly notes, the least fixed point has important connections with the
theory of inductive definitions (see Kripke (1984 [1975]) and McGee (1991)).

11 See Visser (1984) and Woodruff (1984) for an application to a four-valued semantics
and McGee (1991) for an application to what he calls ‘partially interpreted languages.’ It
can be shown that fixed points exist in certain nonmonotonic schemes also; see Gupta
and Belnap (1993, section 2).

12 Note that the equivalence cannot be expressed using the biconditional ↔, because
(n ↔ n) = n.

13 This observation was first made by John Hawthorn (1983). Note that a Weak Kleene
language can contain both its truth and its ‘neither-true-nor-false’ predicate (Gupta and
Martin, 1984). Like other three-valued languages, however, it cannot contain a predicate
that expresses ‘untrue’.

14 There is a more general version of this thesis: logical and semantic sense can be made of
systems of mutually interdependent definitions and of mutually interdependent concepts.
For the sake of notational simplicity, I discuss only the narrower thesis and here too I
restrict myself to definitions of one-place predicates.

15 Gupta and Belnap (1993) uses ‘A is valid on 𝒟 in S*’ for ‘𝒟 implies A in S*’.

16 Recall that extensions fix classical interpretations. It will sometimes prove useful to
identify an interpretation with the corresponding extension.

17 See Belnap (1993) for an illuminating discussion of the two requirements.

18 By accepting this definition we are not forced to view the T-biconditionals as fixing the
meaning or the use of ‘true’. One may view them as fixing only the intension of ‘true’
(Gupta and Belnap, 1993, pp. 20-29).

19 See Gupta and Belnap (1993, sections 6A and 6B). Kremer (2001, forthcoming) has
recently contributed a valuable study of the phenomenon. Kremer's paper contains
negative solutions for problems 6B.12 and 6B.15 of Gupta and Belnap (1993).

20 Koons argues against S# in his critical essay (Koons, 1994). Further criticisms of revision
theories may be found in Simmons (1993), McGee (1997), and Martin (1997).

21 The diagnosis is questionable, however, because the paradoxes occur in languages that
are far from universal.

22 Note well the order of the quantifiers here. Sometimes Tarski means universality in a
weaker sense. The above quote from Tarski continues, “[the common language] is
continually expanding to satisfy this requirement [i.e., universality]” (1969, p. 67).

23 The construction of a semantically self-sufficient language is more likely, I believe, if
the syntactic resources of ℒ are judiciously restricted; see Gupta (1984 [1982], section II).
For further discussion of universality and semantic self-sufficiency see McGee's, D. A.
Martin's, and my papers in Villanueva (1997).

24 The argument is straightforward but not indubitable. Perhaps the contextual contribution
is to shift the interpretation of ‘not.’ Perhaps it is to invoke a new sense of ‘true.’


Chapter 6


Logical Consequence 


Patricia A. Blanchette 


6.1. Introduction 


Whenever one asserts a claim of any kind, one engages in a commitment not just to
that claim itself, but to a variety of other claims that follow in its wake, claims that,
as we tend to say, follow logically from the original claim. To say that Smith and
Jones are both great basketball players is to say something from which it follows that
Smith is a great basketball player, that someone is a great basketball player, that
there is something at which Smith is great, and so on.

This general fact, that certain claims follow logically from others, is the central
concern of a theory of logical consequence. Logical consequence is just the relation
that connects a given claim or set of claims with those things that follow logically
from it; to say that B is a logical consequence of A is simply to say that B follows
logically from A. All of ordinary reasoning turns on the recognition of this relation.
When one notices, for example, that a certain prediction follows from a given
theory, that a particular view is a consequence of some initial commitments, that a
collection of premises entails a given conclusion, and so on, one is engaged in
reasoning about logical consequence.

The other logical properties and relations whose recognition is central to ordinary
reasoning are closely related to, and can be defined in terms of, logical consequence:


• An argument is valid iff (if and only if) its conclusion is a logical consequence of
  its premises.
• A set of claims Γ entails a claim α iff α is a logical consequence of Γ.
• A set of claims Γ is consistent iff no contradiction is a logical consequence of it.
• A claim α is independent of a set of claims Γ iff α is not a logical consequence of Γ.
• A claim α is a logical truth iff it is a logical consequence of the empty set of
  claims.
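
The way these notions reduce to a single consequence relation can be made concrete
for a very small fragment. The Python sketch below uses truth-table consequence for
propositional formulas as a stand-in; this is an assumption made purely for
illustration, since the relation discussed in this chapter is much broader than
truth-functional consequence. The tuple encoding and all function names are
hypothetical.

```python
from itertools import product


def Var(name):
    return ("var", name)


def Not(f):
    return ("not", f)


def And(f, g):
    return ("and", f, g)


def Or(f, g):
    return ("or", f, g)


def atoms(f):
    """The sentence letters occurring in a formula."""
    if f[0] == "var":
        return {f[1]}
    return set().union(*(atoms(sub) for sub in f[1:]))


def holds(f, v):
    """Truth value of formula f under the valuation v (a dict from letters to bools)."""
    op = f[0]
    if op == "var":
        return v[f[1]]
    if op == "not":
        return not holds(f[1], v)
    if op == "and":
        return holds(f[1], v) and holds(f[2], v)
    if op == "or":
        return holds(f[1], v) or holds(f[2], v)
    raise ValueError(op)


def is_consequence(conclusion, premises):
    """Stand-in consequence relation: no valuation makes every premise true
    and the conclusion false."""
    letters = sorted(set().union(atoms(conclusion), *(atoms(p) for p in premises)))
    for values in product([True, False], repeat=len(letters)):
        v = dict(zip(letters, values))
        if all(holds(p, v) for p in premises) and not holds(conclusion, v):
            return False
    return True


# The derived notions, each defined only through is_consequence.
def valid_argument(premises, conclusion):
    return is_consequence(conclusion, premises)


def consistent(claims):
    contradiction = And(Var("p"), Not(Var("p")))
    return not is_consequence(contradiction, claims)


def independent(claim, claims):
    return not is_consequence(claim, claims)


def logical_truth(claim):
    return is_consequence(claim, [])


S, J = Var("Smith is great"), Var("Jones is great")
print(valid_argument([And(S, J)], S))   # conjunction to conjunct
print(consistent([S, Not(S)]))          # a claim together with its negation
print(logical_truth(Or(S, Not(S))))     # excluded middle
```

Each derived notion is a one-line wrapper around is_consequence, mirroring the
definitions in the list above; only the stand-in relation itself would change if a
richer account of consequence were substituted.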


The investigation of logical consequence and related notions consists largely in
the attempt

(a) to give a systematic treatment of the extension of this relation, i.e., of the issue
    of which claims do in fact follow logically from which others; and
(b) to give an informative account of the nature of the relation.


There is much room for debate about both of these issues. Though there is no
doubt about the fact that some claims do in fact follow logically from others,
and (perhaps more obviously) that some sets of beliefs are inconsistent, that some
arguments are definitely not valid, and so on, there is room for disagreement about
cases. Philosophers have disagreed, for example, about whether certain purely
mathematical claims are logical consequences of apparently non-mathematical claims.
They have disagreed about whether existential claims about properties follow
logically from ordinary predications. And so on.

To give a determinate answer to each and every question of the form ‘Does this
follow logically from that?’ will require, among other things, a decision about the
precise boundaries separating logical consequence from set-theoretic or mathematical
fact, and a decision about the metaphysical commitments of various kinds of
claims. The project of clarifying the extension of the logical consequence relation
thus turns, to some extent, on issues outside the scope of the philosophy of logic
proper, on issues, for example, that fall within the camp of pure metaphysics,
philosophy of mathematics, and related fields. But some of the central questions about
the extent of the relation depend for their answers on the second of the two topics
noted above, namely on the issue of the nature of logical consequence. Reflection
on some clear, easily recognized cases of logical consequence, such as that


Socrates is mortal.

follows logically from the pair of claims

All humans are mortal.

and

Socrates is human,


reveals some straightforward necessary conditions for logical consequence. It
is uncontroversial, for example, that the relation of logical consequence is
truth-preserving, i.e., that the logical consequences of true claims must themselves be true.
But things become more difficult, and more contentious, when one tries to fill in
more details. Some of the disagreements here turn, for example, on questions about
whether the conclusion of a valid argument must, in some sense, be ‘about’ the
same subject-matter as are its premises, and about whether there is a clear sense in
which the fundamental principles of logic must hold independently of any particular
subject-matter, to be, as it is often put, topic-neutral. Further disagreements arise
about whether, and in what sense, logical truths must always be necessary truths, and
so on. Different answers to these questions will deliver different views about the
precise extension of the logical consequence relation, and will consequently give rise
to different systematic treatments of consequence in the form of formal systems of
logic.

The purpose of this chapter is to provide a brief introduction to the central issues
surrounding the nature and the extension of logical consequence, and to the role of
formal systems in the investigation of consequence.


6.2. Early Formal Systems and Accuracy 


A formal system of logic consists of a rigorously specified formal language (i.e.,
a collection of formulas defined solely in terms of their syntax) together with a
deductive system (i.e., a specification of those series of formulas that will count as
deductions of particular formulas from collections of formulas). Since the end of the
nineteenth century, formal systems have been widely used as means of codifying and
analyzing the relation of logical consequence and its associated notions.

The earliest formal systems, e.g., Frege's, were intended to give a way of rigorously
demonstrating relations of logical consequence - i.e., of demonstrating of
particular claims that they were indeed logical consequences of particular sets of
claims. The intention in designing the system was, to put it somewhat loosely, that
the system would include a deduction of a formula φ from a set Σ of formulas only
if φ was indeed a logical consequence of Σ. Deducibility within the system was to
have been a reliable indicator of logical consequence.

The ‘somewhat loose’ character of the above description is due to the fact that the
formulas themselves (i.e., strings of marks on paper) are not, in fact, the items that
bear the logical consequence relation to one another. Thus one cannot, strictly
speaking, describe the goal of system-design as one of making sure the deducibility
relation is included in the logical consequence relation. As Frege saw it, the items
that bear logical relations to one another are nonlinguistic propositions, the kinds of
things that are expressed by fully interpreted sentences, and that are the objects of
the propositional attitudes. And, as Frege saw it, his formulas as they appeared in
deductions always expressed determinate propositions. Thus the Fregean goal for an
adequate formal system can be described, now accurately, as


a formula φ is to be deducible from a set Σ of formulas only if the proposition
expressed by φ is a logical consequence of the propositions expressed by the
members of Σ.


See Frege (1964 [1893], Vol. 1), esp. the Introduction and sections 14, 15, 18, 20.

It is, in principle at least, a straightforward matter to check whether a system
satisfies this requirement. Since the deducibility relation is typically defined in the
familiar way in terms of axioms and rules of inference, the quality-control check is
(in principle, at least) simple: one checks to see that every proposition expressible by
an axiom is a truth of logic, and that each rule of inference countenances the
deduction of a formula α from an n-tuple of formulas β₁ ... βₙ only if the proposition
expressed by α is a logical consequence of those expressed by β₁ ... βₙ.

Frege himself did not offer any uniform method for carrying out this ‘checking,’
i.e., for demonstrating that a given proposition is, in fact, a truth of logic, or that a
given proposition does, in fact, follow logically from a particular collection of
propositions. He seems to presume that the very simplest cases of logical truth and of
logically valid inference are obvious when encountered. The accuracy of the formal
system was to have been established by simply pointing out that the propositions
expressed by axioms were indeed obvious truths of logic, and that the rules of
inference obviously generated only logical consequences from premises. The
importance of the formal system was, in large part, that once an audience had granted the
handful of (ostensibly) immediately obvious logical principles required for axioms
and rules of inference, it was a straightforward matter to demonstrate the validity of
considerably more complicated and non-obvious arguments. The accuracy of the
system as a whole rested simply on the logical status of the axioms and rules of
inference, and the validity of extremely complicated arguments was guaranteed by
the accuracy of the system.

As it turned out, Frege's favored formal system (1964 [1893]) was not, in fact,
accurate. His deducibility relation contained a subtle but important flaw, with the
result that the system contains deductions of both a formula φ and its negation ~φ
from the empty set of premises. And since the propositions expressed by such pairs
of formulas cannot both be logical truths, Frege’s deducibility relation cannot be
considered a reliable indicator of logical consequence. Some of the fundamental
claims that Frege took to be ‘obvious’ logical truths were in fact falsehoods.¹

Successors to Frege's system, of course, avoid this particular error, and a number
of them offer presumably accurate indications of logical consequence. Before
turning to a discussion of contemporary formal systems, two features of Frege's
approach to formal systems are worth noting.

The first is the Fregean view of the bearers of the logical relations. Of central
concern in the philosophy of logic, the question is this: what kinds of things, exactly,
are the logical truths, the relata of the logical consequence relation, the components
of valid arguments, and so on? Frege’s answer is, as noted above, that these bearers
of logical relations are a particular kind of abstract object, namely, nonlinguistic
propositions. The attraction of this view stems from the fact that these propositions
are also, as Frege and most proposition-theorists see it, both the semantic values of
our utterances and the objects of our propositional attitudes. The combination
of these views gives an easy explanation of the fact that not only our assertions, but
also our beliefs, the contents of our hopes and fears, and so on, can have logical
implications. The view also helps to make sense of the apparent logical connections
between these different kinds of entities; the very thing that forms the content of
one person’s desire can logically contradict the content of another's assertion and of
yet a third person's belief, and the straightforward explanation of this, on the
propositional view, is that the desire, the assertion, and the belief all have
propositions as their contents. But the view is not without problems. The central difficulty
with the view of nonlinguistic propositions as the bearers of the logical relations is
that, arguably, it is doubtful that there are such things as nonlinguistic propositions.
Reasons for doubting the existence of propositions stem primarily from the difficulty
of giving clear criteria of individuation for propositions, from general worries about
abstract objects, and from considerations of ontological parsimony. See Quine (1953b
[1948], 1970, ch. 1) and Cartwright (1987c [1962]).

Skepticism about nonlinguistic propositions leads to the alternative view that
sentences are the bearers of the logical relations. Some caution is required here,
however, about precisely what is meant by the claim that sentences are logical truths,
are logical consequences of one another, and so on. Taking a sentence to be simply
a series of marks or sounds, the view that the logical relations are borne by sentences
is untenable. The string of symbols


All men are mortal. 
does not, by itself, have any logical implications, any more than does the string, 
Ait 


Though it is tempting to view the first string as having a rich collection of logical
implications, it is important to note that this temptation is felt only when we take
the sentence to be not merely a string of shapes on paper, but rather to be
something with a determinate meaning. A bare series of symbols does not have any
logical properties at all, though a string of symbols together with the right kind of
semantic value certainly does. The view that the logical relations obtain between
sentences, then, is only a reasonable view if by ‘sentence’ one means something like
‘series of symbols together with a determinate meaning.’ The view, in short, is that
the bearers of the logical relations are meaningful sentences.

The two views in question (that nonlinguistic propositions are the primary bearers
of the logical relations, and that sentences together with their meanings are the
primary bearers of the logical relations) are not importantly different if, as is
sometimes done, one takes the meanings of sentences just to be nonlinguistic
propositions. But if one takes the meaning of a sentence to be something non-propositional,
for example, a pattern of use in a given population, then the two views are
importantly different, with the first but not the second committed to the existence of
something like Fregean propositions. The latter understanding of ‘meaning’ is of
course required by those whose view is motivated by skepticism about the existence
of nonlinguistic propositions.

Despite the intrinsic interest of this issue, the difference between the two views
(propositional and sentential) of the bearers of the logical properties and relations will
not be terribly important in what follows, and it is not necessary here to adjudicate
between them. The important point about both views is that they are fundamentally
‘semantic’ in the sense that they construe the logical relations as obtaining either
between meanings themselves (propositions), or between pairs of syntactic items and
meanings. And this is as it should be; as noted above, the logical relations do not
obtain between bare syntactic items, but only between items which make some
determinate claim on the world; which are, in brief, meaningful. In what follows,
the relata of the logical relations will simply be referred to as claims, taking the word
to be ambiguous between nonlinguistic propositions and meaningful sentences.
Except where noted, everything said below applies on either reading.

The second feature worth noting here about the conception of logic underlying
Frege’s and similar formal work concerns the distinction between the pretheoretic
relation of logical consequence and the various relations of formal deducibility given
by particular formal systems. The relation of logical consequence is pretheoretic in
the sense that neither the relation itself, nor our recognition of the relation, depends
upon the existence or the deliverances of formal systems. Similarly for the related
notions of logical truth, consequence, consistency, and so on. When one infers


Katy is wise, 
from the pair of claims 

All of John’s children are wise 
and 

Katy is a child of John’s 


one recognizes a connection between these claims that would have held whether or
not anyone had ever invented formal systems, and whether or not any of those
systems had pronounced the inference valid. Similarly for the ordinary notions of
inconsistency, validity, entailment, and so on that are recognized in everyday
reasoning. These logical properties and relations link assertions, beliefs, and theories one to
another in ways that do not depend upon the results of work in formal logic. The
dependence is, rather, the other way round; standard systems of formal logic will
count the argument just noted (or a formalized version of it) as valid because those
systems are designed to reflect the pretheoretic logical properties and relations
accurately. It is only with respect to this sense of a system-independent notion of logical
consequence that one can make sense of the idea of a formal system's being accurate
or inaccurate, since the accuracy of the system is a matter of the extent to which
its relation of formal deducibility reliably indicates the pretheoretic relation of logical
consequence. And, of course, it is only against the background of such a
system-independent notion of logical consequence that one can agree with Frege’s later
assessment of his own formal system as inaccurate in the way noted above.


6.3. Contemporary Formal Systems 


Contemporary formal systems differ from Frege’s in two ways that are relevant to
the issue of their accurate reflection of the logical properties and relations. First of
all, one does not, these days, typically view the formulas that occur in deductions as
each expressing unique claims. Each formula is typically thought, rather, to be capable
of expressing a broad range of claims. The guidelines governing precisely which
claims each formula can appropriately express are seldom made explicit; they are
simply the rules of thumb passed on when teaching students how to do ‘translations’
between formal and natural languages. They are, in the typical case, rules about the
fixed meanings to be assigned to the logical constants, the kinds of meanings
assignable to the members of each syntactic category, rules of compositionality, and so on.
These are the rules borne in mind when one notes that, for example, ‘∃xFx’ can be
used to formalize the claims


There's at least one prime number.
or 

Someone is French, 
but not 

All cows are mammals. 


The second relevant difference between Fregean and standard contemporary
systems is that the latter typically incorporate a model-theoretic apparatus. A model for a
formal language is a function which, while meeting a variety of requirements specific
to that language, assigns a truth-value to each closed formula of the language. The
standard requirements include, for example, the requirement that a model assigns
true to a formula of the form (α & β) only if it assigns true to both α and β. For a
quantified language, the assignment of truth-values proceeds via an assignment of
individuals and sets to the atomic parts of formulas. [See chapter 1.]

Instead of assessing the adequacy of formal systems in the Fregean way, by
directly examining the relationship between deducible formulas and the claims they
express, typical practice with contemporary systems is to assess the adequacy of
a system by examining the relationship between deducible formulas and the
truth-values assigned those formulas by various models. Where Σ is a set of formulas of
a formal system S and φ is a formula of S,





• φ is a model-theoretic consequence in S of Σ if every one of S's models that assigns
true to each member of Σ also assigns true to φ. This is abbreviated: Σ ⊨_S φ.

• A formula φ is a model-theoretic truth of S if every one of S's models assigns true
to φ.


A central question that arises for a formal system with a model-theoretic apparatus is
that of the coincidence between the relation of model-theoretic consequence and
the relation of deducibility. Abbreviating ‘φ is deducible in S from Σ’ as ‘Σ ⊢_S φ’, the
two halves of the coincidence claim form the soundness and completeness theorems
for S, as follows:


Soundness of S: For every set Σ of formulas of S, and every formula φ of S,
if Σ ⊢_S φ, then Σ ⊨_S φ.

Completeness of S: For every set Σ of formulas of S, and every formula φ of S,
if Σ ⊨_S φ, then Σ ⊢_S φ.


If one’s primary interest is in devising a formal system whose deductive and
model-theoretic consequence relations coincide, then the soundness and
completeness theorems are of interest in their own right. When, on the other hand, the
purpose is the design of a formal system that will be a reliable indicator of logical
consequence, these theorems are of interest largely because they allow one to infer
the adequacy of each of these consequence-like relations (deducibility or
model-theoretic consequence) from the other. If one knows that model-theoretic
consequence within a system S is a reliable indicator of logical consequence, then the
soundness theorem for S will give a reliability result for S's deducibility relation. If,
on the other hand, one has an independent guarantee of the reliability of S's
deducibility relation with respect to logical consequence, then the completeness
theorem for S will establish the reliability in this regard of S's model-theoretic
apparatus.

Because, unlike their Fregean antecedents, standard contemporary systems take
each formula to be capable of formalizing a wide range of claims, somewhat more
complexity is needed in formulating the questions of the reliability of the
model-theoretic and the deductive consequence relations of formal systems. One cannot
simply ask whether φ's being a model-theoretic consequence in S of Σ entails that
the claim expressed by φ is a logical consequence of the set of claims expressed by Σ,
since there are no unique claim and set of claims expressed by φ and by Σ,
respectively. One wants, rather, to ask whether this implication holds for all of the claims
and sets of claims expressible by φ and Σ respectively. Similarly for the relation
⊢_S.

Consider, for example, the set of formulas {∀x(Fx → Gx), Fa} and the formula Ga
of standard first-order logic. These formulas could be taken to express, respectively,
the claims


All prime numbers are odd. 


Seven is a prime number. 
Seven is odd. 


On another occasion, these formulas might represent the trio


All sheep are mammals.
Dolly is a sheep.
Dolly is a mammal.

Each such assignment of claims to the formulas of a formal language is what is
called a reading of that language. That is to say, a reading of a language is an
assignment of claims to its closed formulas in a way that satisfies the usual ‘rules of
thumb,’ as mentioned above, for that language. When one asks whether deducibility
in a given formal system S is a reliable indicator of logical consequence, the interest
is in whether this reliability holds for each of the readings of the language. Similarly
for the question of the reliability of the model-theoretic consequence relation for S.

Where Σ and φ are a set of formulas and a formula, respectively, of the language
of a formal system S, and R is a reading of that language, ‘R(Σ)’ shall mean the set
of claims assigned by R to Σ, and ‘R(φ)’ the claim assigned by R to φ. For example,
where R₁ is the first reading given in the above example, R₁({∀x(Fx → Gx), Fa}) is
the set of claims


{All primes are odd. Seven is prime.}

and R₁(Ga) is the claim

Seven is odd.
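
The notion of a reading can be made concrete with a small sketch. What follows is only an illustration under stated assumptions: formulas are represented as plain strings, claims as English sentences, and the names R1, R2, and apply_reading are hypothetical labels introduced here, not part of any formal system discussed in the text.

```python
# Illustrative sketch only: formulas are plain strings, claims are English sentences.
# The names PREMISES, CONCLUSION, R1, R2 and apply_reading are hypothetical.
PREMISES = ("forall x (Fx -> Gx)", "Fa")
CONCLUSION = "Ga"

# Two readings of the same three closed formulas, using the examples from the text.
R1 = {
    "forall x (Fx -> Gx)": "All prime numbers are odd.",
    "Fa": "Seven is a prime number.",
    "Ga": "Seven is odd.",
}
R2 = {
    "forall x (Fx -> Gx)": "All sheep are mammals.",
    "Fa": "Dolly is a sheep.",
    "Ga": "Dolly is a mammal.",
}


def apply_reading(reading, formulas):
    """Return R(Sigma): the set of claims the reading assigns to the given formulas."""
    return {reading[f] for f in formulas}


print(apply_reading(R1, PREMISES))
print(R1[CONCLUSION])
```

The point of the sketch is simply that one and the same set of formulas is sent to different sets of claims by different readings, so that questions of reliability must be asked reading by reading.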

Questions about the reliability of the deductive and model-theoretic consequence
relations of a formal system S, then, can be expressed as the questions of whether it
is generally true that:





(i) If Σ ⊢_S φ, then R(φ) is a logical consequence of R(Σ).
(ii) If Σ ⊨_S φ, then R(φ) is a logical consequence of R(Σ).


If (i) holds for all Σ, φ, and R for a formal system S, then S's deducibility relation is
reliable. If (ii) holds for all Σ, φ, and R for a system S, then S's model-theoretic
consequence relation is reliable.

It is sometimes assumed that, at least for those formal systems standardly in use,
the model-theoretic consequence relation is ‘automatically’ a reliable indicator of
logical consequence, which is to say that (ii) is obviously satisfied by such systems.
This assumption tends to rest on the view that the relation of model-theoretic
consequence is merely a tidied up version of, or a successful re-description of, the
pretheoretic relation of logical consequence itself. If this is the case, then the
question of the reliability of the deducibility relation is immediately reducible to the
question of its satisfaction of the soundness theorem. This assumption of the
coincidence between model-theoretic consequence and logical consequence has, however,
been challenged, and the grounds for inferring deductive reliability from soundness
are by no means obvious; see Blanchette (2000), Etchemendy (1999 [1990]), McGee
(1992), Shapiro (1998), and Sher (1991).
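
The reduction just mentioned is a simple chaining of conditionals, which may be set out as follows. This is a sketch in LaTeX notation, with the turnstile symbols standing for deducibility and model-theoretic consequence in S; the abbreviation LC(R(Σ), R(φ)) for ‘R(φ) is a logical consequence of R(Σ)’ is introduced here only for compactness and is not notation used elsewhere in the chapter.

```latex
% LC(R(\Sigma), R(\varphi)) abbreviates: R(\varphi) is a logical consequence of R(\Sigma).
\begin{align*}
\text{Soundness of } S:\quad & \Sigma \vdash_S \varphi \;\Rightarrow\; \Sigma \models_S \varphi \\
\text{(ii)}:\quad & \Sigma \models_S \varphi \;\Rightarrow\; \mathrm{LC}(R(\Sigma), R(\varphi)) \\
\text{hence (i)}:\quad & \Sigma \vdash_S \varphi \;\Rightarrow\; \mathrm{LC}(R(\Sigma), R(\varphi)) \\[1ex]
\text{Completeness of } S:\quad & \Sigma \models_S \varphi \;\Rightarrow\; \Sigma \vdash_S \varphi \\
\text{(i)}:\quad & \Sigma \vdash_S \varphi \;\Rightarrow\; \mathrm{LC}(R(\Sigma), R(\varphi)) \\
\text{hence (ii)}:\quad & \Sigma \models_S \varphi \;\Rightarrow\; \mathrm{LC}(R(\Sigma), R(\varphi))
\end{align*}
```

In each case the formal theorem does only half the work; the other premise, (ii) in the first chain and (i) in the second, is a substantive claim about logical consequence that no theorem about S can supply.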

Satisfaction of (i) and (ii) are two of the central issues to be treated in establishing
the accuracy of a formal system that has both a deductive and a model-theoretic
apparatus. If one is interested in the use of the system not only to give positive
judgments of logical consequence, but also negative such judgments, one will be
interested in the stronger biconditional versions of (i) and (ii).

The central difficulty in establishing either (i) or (ii) (or their strengthened
biconditional versions) is that there is no independent test for satisfaction of the
consequent of each. After all, if there already was a reliable test of logical
consequence, one would not be in the position of devising formal systems to provide
such a test. Nevertheless, it is possible, in fact, to formulate some relatively
straightforward necessary conditions on logical consequence, some of which can be used to
give at least partial evaluations of formal systems. The idea, roughly, is that if C is
some condition that must be met by the logical consequence relation, then no
formal system whose deductive/model-theoretic consequence relation fails to meet
C can be said to satisfy (i)/(ii). The next section looks at a small sample of such
conditions.


6.4. Conditions on Consequence


The logical consequence relation is evidently truth-preserving, i.e., if each of a set of
claims is true, then so too are all of its logical consequences. This provides a very
minimal condition on formal systems: given a reading R of the language, it must
never be the case that Σ ⊢_S φ if each member of R(Σ) is true while R(φ) is false.
Similarly for ⊨_S. It is a relatively straightforward matter to check for satisfaction of
this condition, and it is indeed satisfied by all of the standard propositional and
first-order formal systems [see chapter 1]. (For issues about second-order systems’
satisfaction of this criterion, see below.)

Presumably, however, the intention is for a much stronger connection between 
premises and conclusion than mere truth-preservation when one says that the latter 
is a logical consequence of the former. One assumes, for example, that agents are 
committed to the logical consequences of the claims they explicitly avow, though of 
course without presuming them to be committed to all of the truths in virtue of a 
commitment to one of them. Hence one assumes that unanticipated commitments 
can be discovered simply by following out the consequences of the explicit claims. 
Similarly for theories; the consequences of a theory's assertions are entirely within 
the realm of claims on the basis of which the theory is to be found adequate or 
wanting, whether or not one takes theories themselves to be closed under logical 
consequence. In short, an important feature of logical consequence is that it
transmits epistemic and theoretical commitment.

In addition to transmitting commitment, logical consequence would appear to be
epistemically inert, in the sense that the logical consequences of things knowable a
priori, or knowable non-empirically or without the aid of intuition, are themselves,
respectively, knowable a priori, non-empirically, without the aid of intuition. For
any kind K of objects, things knowable without access to objects of kind K pass on
this property to their logical consequences. There is a rough sense, then, in which
the logical consequences of a claim have no ‘new content’ over and above that had
by the original claim. Whether this conception of content can be characterized
sufficiently clearly, independently of the relation of consequence, to provide much
elucidation here is unclear; for now, it will suffice to note that the consequence
relation preserves the epistemic categories just noted.

Finally, there is, it is usually agreed, a certain modal characteristic of logical
consequence. The fundamental idea here is that there is a necessary connection
between a claim and its logical consequences: if a claim α is a logical consequence of
a set of claims Γ, then it is not possible for α to be false while the members of Γ are
true. Similarly, if an argument is valid, then it is impossible for its premises to be true
while its conclusion is false.

All of these features of the logical consequence relation provide conditions that 
must be met by any reliable deductive or model-theoretic account of consequence.
Some of them are more easily formulable and systematically tested than others. This 
section looks briefly at the criterion given by the last mentioned condition, namely 
the modal character of logical consequence. 

The relevant criteria of adequacy for the deductive and model-theoretic
consequence relations for a system S are that:


(i’) If Σ ⊢_S φ, then it is impossible for each member of R(Σ) to be true while R(φ)
is false.

(ii’) If Σ ⊨_S φ, then it is impossible for each member of R(Σ) to be true while R(φ)
is false.


As usual, R is an assignment of claims (meeting the usual ‘rules of thumb’) to S's
formulas. Only if a formal system S satisfies both (i’) and (ii’) for every Σ, φ, and
R will that system's model-theoretic and deductive consequence relations prove
reliable indicators of logical consequence. Satisfaction of (i’) is relatively easily
established (or refuted); one simply checks each axiom to see that it expresses only
necessary truths, and checks each rule of inference to see that it preserves this
property. In the case of propositional logic, for example, one simply notes that the
axioms (instances of a small handful of forms, like


((A & B) → A)


) express only necessary truths, and that the rule(s) of inference (e.g., modus ponens)
generate only necessary consequences.
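
The mechanical part of this check can be sketched in a few lines. The sketch below is an illustration only, and it rests on an assumption worth flagging: it tests truth-table validity, which is used here as a stand-in for the modal claim that the axiom expresses only necessary truths. The function names are hypothetical, and the axiom form tested is the one displayed above.

```python
from itertools import product


def implies(p, q):
    """Material conditional on truth-values."""
    return (not p) or q


def is_tautology(formula, num_vars):
    """True if the formula comes out true on every row of its truth table."""
    return all(formula(*row) for row in product([True, False], repeat=num_vars))


# The axiom form ((A & B) -> A) as a function of the truth-values of A and B.
def axiom(a, b):
    return implies(a and b, a)


def modus_ponens_preserves_truth():
    """True if no row makes A and (A -> B) both true while B is false."""
    for a, b in product([True, False], repeat=2):
        if a and implies(a, b) and not b:
            return False
    return True


print(is_tautology(axiom, 2))
print(modus_ponens_preserves_truth())
```

Passing both checks shows only that the axiom form and the rule behave well with respect to truth tables; that truth-table validity tracks necessity is a further claim, which has to be argued for separately.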
A similar argument almost suffices for the usual first-order deductive systems.


The only difficult point here concerns the ‘non-empty universe’ assumption built
into standard first-order systems [see chapter 1]. Such formulas as


Gee 
and 
(BeBe Fe) 


are deductive theorems of standard first-order systems. But the claims expressible
by these formulas are not uncontroversially necessary truths, and hence not
uncontroversially logical consequences of the empty set. If indeed these formulas
do express non-necessary truths, then standard first-order systems fail to satisfy (i’),
and hence fail to satisfy (i). Aside from these existential formulas, however, it is
relatively uncontroversial that standard first-order systems do satisfy the modal
requirement (i’).

In both the propositional and first-order cases, the completeness theorem can be
used to show satisfaction of (ii’) via satisfaction of (i’). So standard propositional
systems do satisfy, and standard first-order systems either do satisfy or almost
satisfy, (ii’) as well.

It is also possible to establish satisfaction of (ii’) directly in certain cases, and it is
instructive to see how non-trivial this can be, even in the case of propositional logic.
Here, for example, is an argument adapted from one by Cartwright (1987a) that
establishes satisfaction of (ii’) by standard systems of classical propositional logic.
Let the language L be a set of formulas freely generated from a non-empty set of
formulas by the binary operation & and the unary operation N. Let a valuation be
a function from L into {T, F}. Where ⊨ is any relation that takes sets of wffs to wffs,
and V is a set of valuations, say that V induces ⊨ iff for every set X of wffs and every
wff α, X ⊨ α iff, for every v ∈ V, if v(x) = T for every x ∈ X, then v(α) = T. Roughly
speaking, the more valuations V contains, the smaller will be the relation induced by
V; where V is empty, the induced relation is universal, and where V contains every
valuation, the induced relation will be minimal, amounting merely to membership of
the conclusion in the set of premises, i.e., X ⊨ α iff α ∈ X. Say that a valuation v is
Boolean iff, for all α and β, v(α) ≠ v(Nα), and v(&αβ) = T iff v(α) = v(β) = T. Let ⊨_B
be the relation induced by the set of Boolean valuations. Notice that ⊨_B is the
relation of truth-table implication; to say that Σ ⊨_B φ is to say, essentially, that any
row of a standard truth-table (treating ‘&’ as conjunction and ‘N’ as negation) that
assigns T to every member of Σ will assign T to φ.
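
The way a set of valuations induces a consequence relation can be illustrated on a small scale. The sketch below makes several simplifying assumptions that are not in the text: the language is cut down to a four-formula fragment (p, q, Np, &pq), formulas are encoded as strings and tuples, and ‘Boolean’ is checked only with respect to that fragment. All names are hypothetical.

```python
from itertools import product

# A finite fragment of L: two atoms, one negation, one conjunction (an assumption
# made for illustration; the language in the text is infinite).
P, Q = "p", "q"
NP = ("N", P)
AND_PQ = ("&", P, Q)
FRAGMENT = [P, Q, NP, AND_PQ]


def all_valuations(formulas):
    """Yield every assignment of True/False (T/F) to the given formulas."""
    for values in product([True, False], repeat=len(formulas)):
        yield dict(zip(formulas, values))


def is_boolean(v):
    """Check the two Boolean conditions, restricted to formulas in the fragment."""
    for f in v:
        if isinstance(f, tuple) and f[0] == "N" and v[f] == v[f[1]]:
            return False  # a formula and its negation received the same value
        if isinstance(f, tuple) and f[0] == "&" and v[f] != (v[f[1]] and v[f[2]]):
            return False  # conjunction value disagrees with its conjuncts
    return True


def induced(valuations, premises, conclusion):
    """X |= a iff every valuation making all of X true also makes a true."""
    return all(v[conclusion] for v in valuations if all(v[x] for x in premises))


EVERY = list(all_valuations(FRAGMENT))
BOOLEAN = [v for v in EVERY if is_boolean(v)]

print(len(EVERY), len(BOOLEAN))
print(induced(BOOLEAN, [AND_PQ], P))  # truth-table implication holds
print(induced(EVERY, [AND_PQ], P))    # fails once non-Boolean valuations are admitted
print(induced([], [P], Q))            # the empty set of valuations induces everything
```

On this fragment the sketch bears out the rough observation above: enlarging the set of valuations shrinks the induced relation, with the empty set inducing the universal relation and the full set inducing mere membership.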

The question of the satisfaction of (ii’) by a standard propositional system is the
question of whether, for every set Γ of wffs and every wff φ, if Γ ⊨_B φ, then the
appropriate necessary connection obtains between the claims made by the members
of Γ and that made by φ. Here, something must be known about the claims
expressible by L's formulas. Let these be governed by the usual constraint:














(C) For all formulas α and β, Nα expresses the negation of what α expresses,
and &αβ expresses the conjunction of what α and β express.


Say that a valuation is admissible if it represents a possible distribution of
truth-values to the formulas, given the constraint (C) on readings. Thus, e.g., a valuation
assigning T to some wff (well-formed formula) α and to Nα is not admissible, since
no claim and its negation can both be true. ⊨_B satisfies (ii’) just in case every
admissible valuation is Boolean. So suppose v is not Boolean. Then either:


(a) for some wff α, v(α) = v(Nα), in which case v is not admissible (since it is not
possible for a claim and its negation to be both true or both false); or

(b) for some wffs α and β, either v(&αβ) = F and v(α) = v(β) = T, in which case v
is not admissible (since it is not possible for two claims to be true while their
conjunction is false), or v(&αβ) = T and either v(α) = F or v(β) = F, in which
case v is not admissible (since it is not possible for a conjunction of claims to
be true while one conjunct is false).





So every admissible valuation is Boolean. QED

For systems lacking a completeness theorem, e.g., typical systems of second-order
logic [see chapter 2], establishment of (ii’), if indeed (ii’) holds, must be by some
such direct method. The question of the modal adequacy of second-order systems is
too large to treat in detail here, but some of the relevant concerns are as follows.

First of all, a complication is that the question of precisely which claims a given
formula can be taken to express is considerably less clearly answered for
second-order systems than it is for first-order and propositional systems. For example, the
formula


∃X∀y(Xy ↔ y = y)


might, or might not, be taken as an appropriate formalization of the claim that there
exists a set of all the self-identical things. If it is taken to be capable of formalizing
such a claim, then the system in question will fail (ii’), since the formula is a
model-theoretic truth, but the claim is not a necessary truth, and is indeed a falsehood. If,
on the other hand, this reading of the formula is ruled illegitimate, then this
particular counterexample to (ii’) is not available. Similarly for a large number of potentially
problematic model-theoretic truths of second-order logic; on some construals of the
expressive power of the language, a variety of such formulas express false claims.
Though on such understandings of the language, the model-theoretic consequence
relation clearly fails to reliably indicate logical consequence, this does not indict the
deductive consequence relation, since again such formal systems lack a completeness
theorem. The reliability of the deductive system, i.e., the satisfaction of (i) and of
(i’), is of course to be established by looking at the details of particular second-order
deductive systems; see Shapiro (1991); [see also chapter 2].

For any of the criteria outlined (truth-preservation, topic-neutrality, necessity,
etc.) the task of checking the reliability of a particular deductive system is relatively
straightforward, since satisfaction of the criteria by the deductive system as a whole
can be traced to the satisfaction of these very criteria by the relatively manageable
collection of axioms and rules of inference. Checking the reliability of
model-theoretic systems, particularly in the absence of a completeness theorem, is often a
considerably more difficult matter. Arguments here will sometimes turn on ad hoc
features of the model-theoretic output of a given system. A nice example of such a
feature arises in the case of second-order logic with respect to the continuum
hypothesis. This example is discussed by Etchemendy in (1999 [1990], ch. 8).

The continuum hypothesis is the hypothesis that there are no sets whose cardinality
is larger than that of the natural numbers (N) and smaller than that of the real
numbers (R). It is generally agreed (following the work of Gödel and Cohen) that
the continuum hypothesis (CH) is independent of the axioms of ZFC
(Zermelo-Fraenkel set theory with the axiom of choice). [See chapter 3.]

The status of the continuum hypothesis makes a difference to model theory.
When asking whether a given formula is true on every model (i.e., is a model-theoretic
truth), one is asking whether there exist models which falsify that formula.
Thus which formulas turn out to be true on every model will depend to a certain
extent on what kinds of models (i.e., on what kinds of sets) there are. Because, in
second-order logic, the properties being of smaller/larger cardinality than N and
being of smaller/larger cardinality than R can be defined, there are sentences of
second-order logic whose status as model-theoretic truths will depend on the
disposition of the continuum hypothesis. Specifically, if the continuum hypothesis is
true, then this sentence will be true on every model:


(∀X)(X > N ⊃ R ≤ X)    (6.1)


where ‘... > N’ and ‘R ≤ ...’ are abbreviations for the definable properties just
noted.

If the continuum hypothesis is false, then there will be models the powerset of
whose domain contains sets larger than N but smaller than R, and hence models
which falsify (6.1). In this case, however, the sentence


(∃X)(X > N) ⊃ (∃X)(X > N & X < R)    (6.2)


will be true on every model.

This would seem to pose a problem for the view that the model-theoretic truths
of such a language are always logical truths, and hence for the view that model-theoretic
consequence in such a system reliably indicates logical consequence. For
assuming that the continuum hypothesis really is, as above, independent of the
axioms of ZFC, one knows that neither the continuum hypothesis nor its negation
is a truth of logic. For no truths of logic are independent of ZFC. And if the
continuum hypothesis is not a truth of logic, then it is not a truth of logic that every
set larger than N is at least as large as R. And since (6.1) simply says that every set
larger than N is at least as large as R, one must conclude that (6.1) is not a truth of
logic. Similarly, if the negation of the continuum hypothesis is not a truth of logic,
then it is not a truth of logic that if there are sets larger than N, then there are sets
larger than N and smaller than R.³ So, from the fact that the negation of the
continuum hypothesis is not a truth of logic, one must conclude that (6.2) is not a
truth of logic either. Hence the problem: Either (6.1) or (6.2) is a model-theoretic
truth, but neither (6.1) nor (6.2) is a truth of logic. So, at least one model-theoretic
truth is not a truth of logic.
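
The shape of this argument can be laid out schematically. The following is only a summary sketch: the symbol ⊨ for "true on every model" and the abbreviation LT for "is a truth of logic" are introduced here for convenience and are not part of the chapter's own notation.

```latex
% Requires amsmath. \mathrm{LT}(\varphi) is an ad hoc abbreviation (an assumption
% of this sketch) for "\varphi is a truth of logic"; \models \varphi abbreviates
% "\varphi is true on every model".
\begin{align*}
&\text{(a)}\quad \mathrm{CH} \Rightarrow\ \models (6.1),
   \qquad \neg\mathrm{CH} \Rightarrow\ \models (6.2)
   && \text{as argued above}\\
&\text{(b)}\quad \models (6.1) \ \text{ or } \ \models (6.2)
   && \text{from (a), since } \mathrm{CH} \lor \neg\mathrm{CH}\\
&\text{(c)}\quad \neg\mathrm{LT}(6.1) \ \text{ and } \ \neg\mathrm{LT}(6.2)
   && \text{from the independence of CH from ZFC}\\
&\text{(d)}\quad \text{for some } \varphi:\ \models \varphi \ \text{ and } \ \neg\mathrm{LT}(\varphi)
   && \text{from (b) and (c)}
\end{align*}
```

Step (c) carries the philosophical weight: it rests on the premise that no truth of logic is independent of ZFC, together with the qualification recorded in note 3.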

A potential response to this problem is that it simply shows that there is no firm
boundary between set theory and logic, hence no firm boundary between set-theoretic
truth and logical truth. This may well be so. But it is not of much help for
the view that model-theoretic consequence and truth in this system reliably indicate
logical consequence and truth. For, to defend the reliability of the model-theoretic
account, one must hold that either (6.1) or (6.2) is in fact a logical truth, and hence
that either the continuum hypothesis or its negation is a logical truth. But this
conflicts with the uncontroversial independence results which prompt the problem
in the first place. If logical truth and set-theoretic truth come to the same thing,
then the independence results demonstrate that neither the continuum hypothesis
nor its negation is a set-theoretic truth, in which case there is no support for the
view that either (6.1) or (6.2) is a logical truth. If there are set-theoretic truths
which are not logical truths, then the fact (if it is one) that either the continuum
hypothesis or its negation is a set-theoretic truth does not support the view that
either (6.1) or (6.2) is a logical truth. In either case, this example would seem to
provide a reason for deeming the usual second-order model-theoretic consequence
relation unreliable as an indicator of logical consequence. This does not, of course,
provide anything like an indictment of second-order logic in general, and, in
particular, says nothing about the reliability of various second-order deductive systems
as indicators of logical consequence.


6.5. Different Formal Systems 


Not every formal system is designed to reflect the full extent of the logical
consequence relation. Systems of propositional logic, for example, are intended to
reflect only a small part of the consequence relation as it applies to the claims
expressible in the languages of those systems. Thus, one way in which two formal
systems can differ over their assessments of logical consequence is that one of the
systems can reflect logical consequences not reflected, and not intended to be
reflected, by the other. Such differences need not indicate any underlying disagreement
about the extension (or nature) of the logical consequence relation; they can
simply be viewed as more or less partial treatments of an agreed upon relation.

However, more robust disagreements are possible as well, disagreements that
stem from fundamental disagreements about the nature of the logical consequence
relation itself. As noted above, there are those who hold that the logical
consequences of a given claim must have a subject matter that is, in some sense, relevant
to that of the claim itself. On this view, for example, one cannot validly argue from
the premises


Jones is wise.

and

Jones is not wise.

to the conclusion

Smith is athletic.


Standard propositional and quantified systems of logic count the formalized version
of this argument as both deductively and model-theoretically valid, with the result
that the relevance theorist must take those systems to be unreliable indicators of
logical consequence. These theorists argue that more reliable indications of
consequence and its related logical notions are given by alternative systems of logic,
called systems of relevance logic [see chapter 13].
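
The classical verdict on this argument can be checked mechanically. The sketch below is illustrative only: it assumes a propositional formalization with atom W for "Jones is wise" and atom A for "Smith is athletic", and the function name `entails` is chosen here rather than taken from the chapter. It tests model-theoretic validity by brute-force truth table.

```python
from itertools import product

def entails(premises, conclusion, atoms):
    """Classical consequence by truth table: valid iff no assignment
    makes every premise true while making the conclusion false."""
    for values in product([False, True], repeat=len(atoms)):
        v = dict(zip(atoms, values))
        if all(prem(v) for prem in premises) and not conclusion(v):
            return False
    return True

# Hypothetical formalization: W = "Jones is wise", A = "Smith is athletic".
jones_wise = lambda v: v["W"]
jones_not_wise = lambda v: not v["W"]
smith_athletic = lambda v: v["A"]

# No row makes both premises true, so the argument counts as valid.
print(entails([jones_wise, jones_not_wise], smith_athletic, ["W", "A"]))  # True
# With a single consistent premise the same conclusion does not follow.
print(entails([jones_wise], smith_athletic, ["W", "A"]))                  # False
```

The first result holds vacuously, because the premises are never jointly true. That vacuity is what the relevance theorist objects to.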

Similarly, systems of intuitionist logic are prompted by the perceived unreliability
of classical systems. For the intuitionist, it is simply not the case in all domains (for
example, when dealing with mathematical existence assertions) that for each claim α,
the corresponding disjunction either α or not-α is always true. On this view, classical
logic is wildly unreliable in its assessments of logical consequence. Intuitionist logics are
those systems of logic designed to provide reliable indications of logical consequence
and related notions as these are understood by the intuitionist [see chapter 11].

Other disagreements with the classical conception of logical consequence have 
given rise to yet more alternative systems [see, for example, chapters 12, 14, 15,
and 16]. In all cases, the same principle is at work: a given conception of the 
pretheoretic relation of logical consequence prompts the construction of a particular 
kind of formal system, one that will give an accurate, systematic treatment of logical 
consequence and its related notions. 


6.6. Analysis of the Relation 


Finally and briefly, this section turns to the intensional question: What is it that
makes one claim a logical consequence of others? A response to this question can
take one of two forms. The first, dismissive, response is that the relation of logical
consequence is primitive and unanalyzable, and hence that one cannot reduce the
fact of A’s following logically from B to any more basic facts about A and B, or to
any more basic relationship between them. The second form of response is to
explain logical consequence in terms of more fundamental facts about A and B and
their relationship to one another.

Looking at the claims expressible by formulas of a particular formal system, one
might be tempted to provide an analysis of logical consequence in terms of deducibility
in that system, or in terms of truth-preservation across the models of that system.
But a moment's reflection will make it clear that no such system-specific analysis of
logical consequence can succeed in clarifying what logical consequence consists in,
i.e., what makes it the case that certain claims are logical consequences of others.
Deducibility within just any system will not do, since there are countless systems,
easily definable, which count exactly the wrong things as logical consequences of
others. Similarly for model-theoretic consequence. So the attempt to analyze logical
consequence via deducibility or model-theoretic consequence must take the analysans
here to be deducibility or model-theoretic consequence within a particular
well-chosen system or kind of system. And the question then arises of what recommends
that system or kind of system as an acceptable standard of logical consequence. The
attempt to answer this question, however, threatens to return us to our original
question, that of what makes one claim a logical consequence of others.

An alternative approach is motivated by the fact that logically valid arguments
come in patterns, patterns like Aristotle's syllogistic forms, or the argument schemes
validated by formal systems, or even the natural language patterns emphasized in
teaching critical thinking. Noticing this, it is tempting to define the logical properties
and relations in terms of these patterns. Thus, for example, one might define
the logical truths as the instances of patterns each instance of which is true, and
the logically valid arguments as the instances of patterns each instance of which is
truth-preserving (i.e., no instance of which has true premises and a false conclusion).
Whether such a characterization will be extensionally accurate will turn on
what counts as a ‘pattern.’ The first difficulty here is that patterns themselves are
definable only for a given language, and facts about which claims and arguments
instantiate the same pattern will vary with the language in question. Consider the
argument


Jones and Smith are of the same height. 





Let L1 be a language in which this argument can be formalized as


ij) = ms) 
w= 
Wa=a 


while L2 formalizes it as 


HG) 


TD) 
io) 


Making the obvious parallel assumptions about the other kinds of claims formalizable
by these series of formulas in L1 and L2, one can see that, although each argument
formalizable by the L1 series is truth-preserving, this is not the case with the L2
series. So the question of whether our original argument exhibits a pattern each
instance of which is truth-preserving, and hence (on the current proposal) the
question of whether that argument is valid, will depend on which language one has
in mind when characterizing the ‘pattern.’ The first problem, then, with the pattern
analysis of the logical relations is that its deliverances will depend in unwanted ways
on the chosen language.⁴

There will indeed be languages for which such a pattern characterization of the
logical relations is extensionally accurate, languages in which the sentence patterns, e.g.,

((α & β) ⊃ α)

each of whose instances is true, will turn out to be patterns each of whose instances
expresses a logical truth. The clear candidates here are the formal languages of
modern logic. Things are less tidy for languages not intentionally designed to have
such a result. It is important for the extensional adequacy of the pattern
characterization of logical truth that, e.g.,

Smith did it for Jones’ uncle.

and

Smith did it for Jones’ sake.

do not count as instantiating the same pattern. And it is important for the attempt
to (non-circularly) analyze the logical relations in terms of patterns that the different
logical implications had by such pairs of sentences not be appealed to in
distinguishing their patterns. In brief, the second difficulty of the pattern analysis of the
logical properties and relations is this: Though one can, for certain specified formal
languages, give an extensionally accurate characterization of the logical properties in
terms of truth-preservation across patterns, this is no reason to suppose that the
logical properties are due to, or explicable in terms of, characteristics of these
sentence patterns. For the formulas of these languages are expressly designed so that
they will instantiate the same syntactic patterns when and only when the claims they
express have relevantly similar logical properties. The two English sentences just
displayed are formalized very differently because they express claims with very
different logical implications; the logical properties are not recognized on the basis of the
patterns. And when turning attention to natural languages, it is difficult to find a
characterization of patterns that is plausible, extensionally speaking, without making
covert appeal to the very logical properties and relations at issue (Etchemendy,
1983).

A perhaps more promising approach is the analysis of logical truth as a kind of
analytic truth. The difficulties of characterizing analyticity itself are legion, but these
are left aside here. The question is whether, granting for the moment the coherence
of the notion of analytic truth, an account of logical truth can be given in terms of
it. Where analytic truths are, roughly, sentences whose truth is due entirely to
matters of meaning (as opposed to matters of ‘fact’), the logical truths will be those
whose truth is due entirely to the meanings of a certain small, select group of terms.
These terms, the ‘logical constants’, include the usual ‘and’, ‘or’, ‘not’, ‘for all’,
‘exists’, perhaps ‘=’, and terms definable in terms of these. Thus while

All professors are academics.

is arguably an analytic truth in the broad sense,

If all professors are arrogant then all professors are arrogant.

falls into the narrower camp of logical truth, since its truth is guaranteed simply by
the meanings of its logical constants.

There are at least two difficulties with this approach. The first is, as noted, that it
is not entirely clear that sense can be made of the notion of analytic truth. The
second is that this characterization of the logical properties and relations seems to
appeal, once again, to the very things it is trying to characterize. To say that the
meanings of a collection of terms ‘suffice for’ or ‘guarantee’ the truth of a sentence
seems to mean little more than that the sentence's truth follows logically from facts
about those meanings, or that its falsehood would be logically inconsistent with those
facts, etc. And if this is right, then one cannot, without vicious circularity, give a
characterization of the logical properties and relations in terms of meanings.
This last problem, the circularity of the proposed analysis, would seem likely to
pose difficulties for virtually any attempted analysis of the tight circle of inter-defined
logical properties and relations. For to give an analysis of logical truth is to say what
it is about a given truth that makes it a logical truth. Similarly for the notions of
validity, logical consequence, inconsistency, and so on. But to say that certain
features F of a sentence or claim make that claim a logical truth is to say something
dangerously close to saying that the sentence’s having F entails that the sentence is
a logical truth. And in saying this, one has proceeded in a very small circle.
Similarly when the analysis is given as a ‘reduction.’ One can try to informatively
reduce the property of logical truth to a collection of non-logical features F of
sentences or claims, holding that to say that a claim α is a logical truth is just to
say that α has features F. And this brief discussion has certainly not exhausted all of
the possible ways of fleshing out such an attempted reduction. But the potential
difficulty faced by all such attempts is that of saying precisely how F and the logical
relationships in question are related, without making recourse to anything like
entailment between the two.

Analysis and reduction are typically intimately connected with the logical properties
and relationships; complex notions are analyzed in terms of simpler ones, or
some are reduced to others, in part by noting logical connections between the
analysans and analysandum. Facts about inconsistencies between affirmations of
analysans and denial of analysandum, of entailments between claims about one and
claims about the other, and so on are noted. If this general pattern is, in fact, a
necessary feature of analysis and reduction, then the logical properties and relations
will be analyzable in terms of, and reducible to, only other members of the circle of
logical properties and relations, and not to any outside it. If so, then we will have to
be content with explanations of these notions that consist of making explicit their
role in our overall semantic and other cognitive activities, but that do not give
simple, informative answers to questions of the form, What makes this a logical
consequence of that?





Suggested further reading


Perhaps the most provocative book written in recent years on the topic of logical consequence
is Etchemendy’s (1999 [1990]), which provides a sustained criticism of the assumption that
model-theoretic consequence relations generally provide adequate analyses of logical
consequence. Reactions to this critique can be found in McGee’s (1992) and Shapiro’s (1991).
The question of whether model-theoretic consequence relations can guarantee the required
modal connection between premises and conclusion is treated in Shapiro’s (1991) and in the
author’s paper (Blanchette, 2000). For discussion of the bearers of the logical relations, and
particularly of the difficulties involved in supposing the existence of nonlinguistic
propositions, see Quine (1953b [1948], 1970, esp. ch. 1). Also see Cartwright (1987e) and Strawson
(1952, esp. ch. 1). For the classic criticisms of the notion of analytic truth and related notions,
together with an influential treatment of logical truth, see Quine (1953e [1951]), while
Strawson (1971 [1957]) gives a response. A number of useful papers on related topics can be
found in Hughes (1993). Haack (1978) presents a very readable discussion of many of these


Notes


1  The problem arises from Frege's assumption that, to every predicate or open sentence,
   there corresponds a set-like entity called an extension. As Russell's Paradox [see chapter 3]
   shows, this assumption is false. See Russell's letter to Frege of 16 June 1902 and Frege's
   response of 22 June 1902, both translated and printed in Frege (1980, pp. 130-3). See
   also Frege (1964, Vol. II, Appendix, pp. 127-42).

2  Some free logics, so-called universal free logics, reject the ‘non-empty universe’
   assumption, and thus deny that these formulas and, e.g., ‘∀xFx ⊃ ∃xFx’ express necessary truths
   [see chapter 12].

3  Assuming, of course, that it is not a truth of logic that there are no sets larger than N. If
   it is, then both (6.1) and (6.2) will be truths of logic, but so too will be the continuum
   hypothesis.

4  This difficulty remains even when the claims are, as above, interpreted sentences. For we
   presumably want a sentence S to count as a logical truth if all sentences synonymous with
   it are as well, and this will not generally be the case on the proposed account.


Chapter 7


Modal Logic
M. J. Cresswell


Modal logic is the logic of necessity and possibility, of ‘must be’ and ‘may be’.
These may be interpreted in various ways. If necessity is necessary truth, there is
alethic modal logic; if it is moral or normative necessity, there is deontic logic [see
chapter 8]. It may refer to what is known or believed to be true, in which case there
is an epistemic logic [chapter 9], or to what always has been or to what henceforth
always will be true, which gives an aspect of temporal logic [chapter 10]. Another
interpretation is to read ‘Necessarily p’ as ‘it is provable that p’. This chapter will
present the general framework of modal logic applicable to all of these, though with
emphasis on alethic modal logics.

In this chapter, the symbol ‘L’ represents the necessity operator, with ‘Lp’ to be
read ‘Necessarily p’. Correlative to this is the possibility operator, ‘M’, with ‘Mp’
being read ‘Possibly p’. (‘□’ is often used instead of ‘L’ and ‘◇’ instead of ‘M’; ‘N’
is also sometimes used instead of ‘L’.) Either operator may be defined in terms of
the other. Thus, if the modal language contains ‘L’ as a primitive operator, then
‘Mα’ may be defined as ‘~L~α’, for any formula α. Impossibility may similarly be
expressed by ‘~M’ (or ‘L~’); contingent propositions are those that are neither
necessary nor impossible.











7.1. Propositional Modal Logic 


This section offers a study of propositional modal logics, after which section 7.2
examines the place of modal operators in first-order predicate logic. This section
is confined to modal logics that are extensions of classical logic [see chapter 1],
although it is also possible to form non-classical modal logics, e.g., by extending
intuitionistic logic [chapter 11] or relevant logic [chapter 13]. For the language of
propositional modal logic, assume the language of the classical propositional calculus,
PC, based on propositional variables, p, q, r, ... etc., and ~ (for negation) and
∨ (for disjunction), with other truth-functional operators being defined in the usual
ways. To this, add the new monadic operator L with the understanding that the
formation rules for PC formulas apply to all the formulas of the extended language
and with the additional rule:


If α is a well-formed formula (wff), then so is Lα.


A system of (propositional) modal logic may be defined as a class S of wff. A
formula α is a theorem of S, written ⊢S α, iff (if and only if) α ∈ S. The logics studied
here will all be normal modal logics, which are extensions of a minimal system called
K.

K is defined axiomatically as the class of all wff that may be obtained from these
five axioms and transformation rules:


PC   If α is a valid wff of PC, then α is an axiom of K.
K    L(p ⊃ q) ⊃ (Lp ⊃ Lq)
US   Uniform substitution: The result of uniformly replacing any variable or
     variables p1, ..., pn in a theorem by any wff β1, ..., βn respectively is a
     theorem.
MP   Modus ponens, or Detachment: If α and α ⊃ β are theorems, so is β.
N    Necessitation: If α is a theorem, so is Lα.





Other systems of modal logic will be formed by adding additional axioms to this 
base, with the result being closed under the three rules. 


7.1.1. Validity


Modal logic, as the logic of necessity and possibility, takes into account not only the
truth and falsity of the way things actually are, but also what would be true or false
if things were different. If one thinks of the way things are as the actual world, one
may then think of how things might have been different as how they are in alternative,
non-actual but possible worlds. As logic is concerned with truth and falsity,
modal logic is concerned with truth and falsity in other possible worlds as well as
the real one. A proposition is then necessary in a world just in case it is true in all the
worlds that are possible alternatives to that world, and possible just in case it is true
in some alternative possible world.

This provides the basis of our formal definition of validity for modal logic.¹ A
frame is an ordered pair (W, R), where W is a non-empty set of objects (worlds),
and R is a binary relation defined over the members of W. R is often called
a relation of ‘alternativeness’ or ‘accessibility’; wRw′ is sometimes expressed by
saying that w ‘can see’ w′. A model is an ordered triple (W, R, V) where (W, R) is a frame
and V is a function assigning values to wff at worlds w ∈ W, according to the
following conditions:


[Vpv] For any propositional variable p and any w ∈ W, either V(p, w) = 1 or
      V(p, w) = 0, but not both.

[V~]  For any wff α and any w ∈ W, V(~α, w) = 1 if V(α, w) = 0; otherwise
      V(~α, w) = 0.

[V∨]  For any wff α and β, and any w ∈ W, V((α ∨ β), w) = 1 if either
      V(α, w) = 1 or V(β, w) = 1; otherwise V((α ∨ β), w) = 0.


and, of special interest for modal logic, and following the informal interpretation of
necessity as truth in all alternative possible worlds,


[VL]  For any wff α and for any w ∈ W, V(Lα, w) = 1 if V(α, w′) = 1 for every
      w′ ∈ W such that wRw′; otherwise V(Lα, w) = 0.





Evaluation conditions for other, defined operators are just what one would expect
[see chapter 1], though it is convenient to note:


[VM]  For any wff α and for any w ∈ W, V(Mα, w) = 1 if V(α, w′) = 1 for
      some w′ ∈ W such that wRw′; otherwise V(Mα, w) = 0.


A model (W, R, V) is said to be based on the frame (W, R).
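
The evaluation conditions translate directly into a recursive procedure over finite models. The following is a minimal sketch, not part of the chapter: the tuple encoding of formulas, the function names `holds` and `implies`, and the sample model are all assumptions made for illustration, with True and False standing in for the values 1 and 0.

```python
# Formulas are nested tuples (an illustrative encoding, not the chapter's notation):
#   ("var", "p"), ("not", a), ("or", a, b), ("L", a), ("M", a)

def holds(formula, w, R, val):
    """Return True iff the formula has value 1 at world w.

    R   : set of pairs (u, v), meaning that u can see v
    val : dict mapping (variable, world) to True or False
    """
    op = formula[0]
    if op == "var":                      # [Vpv]
        return val[(formula[1], w)]
    if op == "not":                      # [V~]
        return not holds(formula[1], w, R, val)
    if op == "or":                       # [V∨]
        return holds(formula[1], w, R, val) or holds(formula[2], w, R, val)
    if op == "L":                        # [VL]: true at every accessible world
        return all(holds(formula[1], v, R, val) for (u, v) in R if u == w)
    if op == "M":                        # [VM]: true at some accessible world
        return any(holds(formula[1], v, R, val) for (u, v) in R if u == w)
    raise ValueError(f"unknown operator: {op}")

def implies(a, b):
    """Material implication, defined from ~ and ∨ in the usual way."""
    return ("or", ("not", a), b)

p = ("var", "p")
R = {("w1", "w2")}                       # w1 sees w2; w2 sees nothing
val = {("p", "w1"): False, ("p", "w2"): True}

print(holds(("L", p), "w1", R, val))              # True: p holds at the only world w1 sees
print(holds(("M", p), "w2", R, val))              # False: w2 sees no world at all
print(holds(implies(("L", p), p), "w1", R, val))  # False: Lp is true at w1 but p is not
# Mp agrees with ~L~p at w1, matching the definition of M in terms of L.
print(holds(("M", p), "w1", R, val) == holds(("not", ("L", ("not", p))), "w1", R, val))  # True
```

Because the clauses for L and M quantify over the worlds a given world can see, a world that sees nothing makes every Lα true and every Mα false.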

Validity is defined by saying first that a wff α is valid in a model (W, R, V) if for
every w ∈ W, V(α, w) = 1. Then a wff α is said to be valid on a frame (W, R) iff α
is valid in every model based on that frame. Specific sorts of validity are defined by
specifying relevant classes of frames on which formulas are valid. Thus, a wff is said
to be valid with respect to a class of frames F (F-valid) iff it is valid on every frame
in F, and a system S is sound with respect to F iff every theorem of S is F-valid. S is
complete with respect to F iff every wff that is F-valid is a theorem of S. When S
is both sound and complete with respect to F, F is said to characterize S. For a
particular frame F = (W, R), if every theorem of S is valid on F, F is said to be a
frame for S.

A wff is K-valid iff it is valid on every frame.


Theorem 7.1  Every theorem of K is K-valid.

To prove this, it suffices to prove that


1  every axiom of K is valid on every frame, and

2  the rules US, MP and N preserve validity on a frame, i.e., that if they are applied
   to formulas which are valid on a frame, the resulting formulas are also valid on
   that frame.


It is more useful, however, to prove a more general theorem, from which theorem
7.1 follows immediately.


Theorem 7.2  If A is a set of modal wff and F is a class of frames such that every
member of A is valid on every member of F, then K + A is sound with respect
to F,

where K + A is the system obtained by adding all the formulas of A as extra axioms
to K, with the result closed under the rules US, MP and N. Like theorem 7.1,
theorem 7.2 is proved by induction on proofs in K + A; this amounts to showing
that every axiom of K is valid on any frame in F and that the rules preserve validity
on a frame, since it is given that all the axioms in A are valid on all the frames in F.
Theorem 7.1 follows from theorem 7.2 by taking F to be the class of all frames.
Theorem 7.2 is important for the following reason. K is the weakest modal system
discussed here; each of the other systems is a proper extension of it. Usually, these
other systems are defined axiomatically by adding one or more extra axioms to the
basis of K; these are the wff in A. For each such system K + A, there is also (or
at least one tries to find) a definition of validity which matches it in the way that
K-validity matches the system K; i.e., which is such that the theorems of the system
are precisely the wff which are valid by that definition. Theorem 7.2 establishes that
to prove that K + A is sound with respect to a class of frames F, it suffices to show
that every member of A is valid on every frame in F.








The system T  If necessity is thought of as necessary truth, it is natural to expect the
formula


T    Lp ⊃ p


to be valid, for it says merely that if p is necessarily true, then it is true. Nevertheless,
T is not K-valid. (Consider a frame ({w, w′}, R) where wRw′ and w′Rw′, but not
wRw; if p is false at w but true at w′, then T is false at w.) By theorem 7.1, this
means that T is not a theorem of K. Let T be the result of adding T to K; i.e., in the
notation above T = K + {T}, or, more simply, K + T, or KT. By theorem 7.2, since
T is valid on every reflexive frame (W, R), i.e., every frame in which R is reflexive,
the system T is sound with respect to the class of all reflexive frames. (Notice that
the frame just described to falsify T is not reflexive.)
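
For a finite frame, validity on the frame can be tested by running through every valuation. The sketch below is illustrative only: the formula encoding and the names `holds` and `valid_on_frame` are assumptions of this example, not the chapter's notation. It applies the test to T on the two-world frame just described, and again after wRw is added to make that frame reflexive.

```python
from itertools import product

def holds(f, w, R, val):
    """Truth at a world, following [Vpv], [V~], [V∨] and [VL]."""
    op = f[0]
    if op == "var":
        return val[(f[1], w)]
    if op == "not":
        return not holds(f[1], w, R, val)
    if op == "or":
        return holds(f[1], w, R, val) or holds(f[2], w, R, val)
    if op == "L":
        return all(holds(f[1], v, R, val) for (u, v) in R if u == w)
    raise ValueError(f"unknown operator: {op}")

def valid_on_frame(f, W, R, variables):
    """Valid on (W, R): true at every world in every model based on the frame."""
    worlds = sorted(W)
    keys = [(x, w) for x in variables for w in worlds]
    for bits in product([False, True], repeat=len(keys)):
        val = dict(zip(keys, bits))
        if not all(holds(f, w, R, val) for w in worlds):
            return False
    return True

p = ("var", "p")
T = ("or", ("not", ("L", p)), p)          # Lp ⊃ p, written with ~ and ∨

W = {"w", "w'"}
R_text = {("w", "w'"), ("w'", "w'")}      # the frame from the text: not reflexive at w
R_refl = R_text | {("w", "w")}            # the same frame with wRw added

print(valid_on_frame(T, W, R_text, ["p"]))   # False
print(valid_on_frame(T, W, R_refl, ["p"]))   # True
```

The falsifying valuation found in the first case is the one the text describes: p false at w and true at w′.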








The system D  If, however, L has a deontic interpretation, if, that is, it expresses
obligatoriness (‘normative necessity’), Lp ⊃ p would not be regarded as valid, since
it would mean that whatever ought to be the case actually is the case. More plausible is


D    Lp ⊃ Mp


which says that whatever is obligatory is permissible, which sounds reasonable enough.
[See chapter 8 for further discussion of deontic interpretations.]

Adding D as an axiom to K produces the system known as D. D is clearly a
theorem of T, so D is included in T, just as K is included in D. It is worth noting
that since L(p ⊃ p) is in D, then because of D, so is M(p ⊃ p). M(p ⊃ p) is not,
however, a theorem of K, for K has no theorems of the form Mα. Any extension of
K that does have theorems of the form Mα will contain D, and so be at least as
strong as D.

To define validity for D, notice that some frames may have so-called ‘dead end’ or
‘blind’ worlds, worlds which have no accessible alternatives or which cannot see any
world in that frame at all. For such a world w, [VL] entails that Lα is (trivially) true
in w, no matter what α is (even p ∧ ~p), and [VM] entails that Mα is (trivially) false
in w. Hence, if a frame contains any dead end or blind world w, then D is not valid
on that frame. Since there are such frames, D is not K-valid and therefore not a
theorem of K. Consider, however, the class of frames containing no dead ends, i.e.,
for every w ∈ W, there is a w′ ∈ W such that wRw′. Such a relation R is called serial,
and such frames serial frames. D is valid on every serial frame. Hence, by theorem
7.2, D is sound with respect to the class of all serial frames. (The model above that
showed that T is not K-valid is based on a serial frame; hence T is not D-valid
either, and so not in D. Hence, T is a proper extension of D. Any reflexive frame
will, of course, be serial.)
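
The connection between D and seriality can be confirmed exhaustively on a small domain. The following sketch is an illustration under stated assumptions: worlds are the integers 0 and 1, the formula encoding and helper names are invented for this example, and the check covers only the sixteen frames on two worlds, so it illustrates the claim rather than proving it.

```python
from itertools import product

def holds(f, w, R, val):
    """Truth at a world, following the chapter's evaluation conditions."""
    op = f[0]
    if op == "var":
        return val[(f[1], w)]
    if op == "not":
        return not holds(f[1], w, R, val)
    if op == "or":
        return holds(f[1], w, R, val) or holds(f[2], w, R, val)
    if op == "L":
        return all(holds(f[1], v, R, val) for (u, v) in R if u == w)
    if op == "M":
        return any(holds(f[1], v, R, val) for (u, v) in R if u == w)
    raise ValueError(f"unknown operator: {op}")

def valid_on_frame(f, W, R, variables):
    """True iff f holds at every world under every valuation on (W, R)."""
    keys = [(x, w) for x in variables for w in W]
    for bits in product([False, True], repeat=len(keys)):
        val = dict(zip(keys, bits))
        if not all(holds(f, w, R, val) for w in W):
            return False
    return True

def is_serial(W, R):
    """No dead ends: every world sees at least one world."""
    return all(any((w, v) in R for v in W) for w in W)

p = ("var", "p")
D = ("or", ("not", ("L", p)), ("M", p))   # Lp ⊃ Mp
T = ("or", ("not", ("L", p)), p)          # Lp ⊃ p

W = [0, 1]
pairs = [(u, v) for u in W for v in W]
agree = True
for bits in product([False, True], repeat=len(pairs)):
    R = {pr for pr, b in zip(pairs, bits) if b}
    if valid_on_frame(D, W, R, ["p"]) != is_serial(W, R):
        agree = False
print(agree)                                   # True: D is valid exactly on the serial frames

R_text = {(0, 1), (1, 1)}                      # the earlier counterexample frame, renamed
print(is_serial(W, R_text))                    # True
print(valid_on_frame(D, W, R_text, ["p"]))     # True
print(valid_on_frame(T, W, R_text, ["p"]))     # False: T fails on a serial frame
```

The last three lines reproduce the parenthetical remark above: the frame that falsifies T is serial, so T is not D-valid.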


7.1.2. Iterated modalities 


Some formulas in the language of modal logic contain sequences of the modal
operators L and M, e.g., the formulas LLp, LMLp, etc. Such sequences are called
iterated modalities. Under some interpretations, it is difficult to know how to
understand such formulas intuitively. In some systems, however, certain iterations may be
replaced by shorter ones, which helps to simplify the problem. A theorem of
equivalence which allows such replacement is called a reduction law of any system of which
it is a member. Here are the four most important of these:


R1   Mp ≡ LMp
R2   Lp ≡ MLp
R3   Mp ≡ MMp
R4   Lp ≡ LLp

None of these is a theorem of T; indeed, T contains no reduction laws at all. T
does, of course, contain LLp ⊃ Lp and LMp ⊃ Mp, and their equivalents, under the
definition of M, Lp ⊃ MLp and Mp ⊃ MMp. Hence, to extend T to contain reduction
laws, it is enough to add the converses of these formulas. Moreover, from
Lp ⊃ LLp, one can derive MMp ⊃ Mp (substituting ~p/p), and from Mp ⊃ LMp
one can derive MLp ⊃ Lp similarly. In addition, Lp ⊃ LLp is derivable from Mp ⊃ LMp,
though Mp ⊃ LMp is not derivable from Lp ⊃ LLp. This suggests two extensions of
T, one that results by the addition of Lp ⊃ LLp as an axiom and one that results by
the addition of Mp ⊃ LMp. (The latter will, of course, contain the former.) These
are the systems known as S4 and S5.


The system S4  S4 is T with the addition of the axiom

4  Lp ⊃ LLp


or K + T + 4. If a modality is defined as any unbroken sequence of zero or more
monadic operators (~, L, M), then in S4 every modality is equivalent to those
shown in figure 7.1 or their negations, where the arrows indicate implication
relations. For negative cases, a corresponding diagram results by negating all the
formulas and reversing the direction of the arrows. Thus, in S4, there are exactly 14
distinct modalities, and any modality may be reduced to one containing no more
than three modal operators in sequence.

[Figure 7.1: diagram of the implication relations among the S4 modalities]

The S4 axiom 4, Lp ⊃ LLp, will be valid on any transitive frame, i.e., any frame in
which the relation R is transitive. Since S4 also contains T, S4 is sound with respect
to the class of all frames that are both reflexive and transitive. (To show that axiom
4 is not a theorem of T, it suffices to give a reflexive frame that is not transitive.)


The system S5  S5 is T plus the additional axiom

E  Mp ⊃ LMp


or K + T + E. Since the S4 axiom 4 is provable in S5, S5 is an extension of S4, and
a proper extension since E is not a theorem of S4. S5 contains all the reduction
laws R1-R4, which means that in any pair of adjacent modal operators one may
delete the first. This procedure may be repeated indefinitely, giving the more general
rule that, in S5, one may delete all but the last modal operator in any sequence.
Hence, S5 has just six non-equivalent modalities: p, Mp, and Lp, and their negations.
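
The deletion rule can be turned into a small reduction procedure. The Python sketch below is an illustration under stated assumptions: the function name reduce_s5 and the representation of a modality as a string over '~', 'L' and 'M' are choices made here, not notation from the chapter. Negations are first pushed rightwards using the definition of M (~L is equivalent to M~, and ~M to L~), after which only the last modal operator is kept.

```python
DUAL = {'L': 'M', 'M': 'L'}

def reduce_s5(modality):
    """Return an S5-equivalent modality from '', 'L', 'M', '~', '~L', '~M'.

    `modality` is a string over '~', 'L', 'M', read as a prefix applied to p.
    """
    negated = False  # whether an odd number of '~' has been pushed rightwards so far
    last = ''        # most recent modal operator, after negations were pushed past it
    for symbol in modality:
        if symbol == '~':
            negated = not negated
        elif symbol in DUAL:
            # ~L is equivalent to M~, and ~M to L~, so a pending negation flips the operator.
            last = DUAL[symbol] if negated else symbol
        else:
            raise ValueError(f"unexpected symbol: {symbol!r}")
    if not negated:
        return last
    # Move the leftover negation back to the front: L~ is ~M, and M~ is ~L.
    return '~' + (DUAL[last] if last else '')

print(reduce_s5('LMLM'))  # M
print(reduce_s5('~L~M'))  # M
print(reduce_s5('ML~'))   # ~M
```

Every input string is mapped to one of six outputs, matching the six non-equivalent modalities of S5.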

Axiom E is valid on every frame in which R is Euclidean, i.e., if wRw′ and wRw″,
then w′Rw″. Notice that if such an R is also reflexive, then it is symmetric and
transitive as well. Relations that are reflexive, symmetric and transitive are called
equivalence relations. Since S5 contains T, it is sound with respect to the class of all
frames that are reflexive and Euclidean; hence, it is sound with respect to the class of
all frames whose relations R are equivalence relations. (This is how S5 is usually
described. Sometimes, however, it is useful to separate axiom E from axiom T, to
form systems weaker than S5, like K45 [see chapter 9], and then it is important to
characterize the relation R as merely Euclidean, or Euclidean and transitive, rather
than a full equivalence relation.)
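
The remark that a reflexive Euclidean relation is automatically symmetric and transitive can be confirmed exhaustively on a small set of worlds. The following Python sketch is only a finite sanity check, not a proof; the three-element set W and the helper names are assumptions made for the illustration.

```python
from itertools import product

W = range(3)
PAIRS = [(a, b) for a in W for b in W]

def reflexive(R):
    return all((a, a) in R for a in W)

def symmetric(R):
    return all((b, a) in R for (a, b) in R)

def transitive(R):
    return all((a, d) in R for (a, b) in R for (c, d) in R if b == c)

def euclidean(R):
    # if wRw' and wRw'' then w'Rw''
    return all((b, d) in R for (a, b) in R for (c, d) in R if a == c)

def all_relations():
    """Yield every binary relation on W (2**9 = 512 of them)."""
    for bits in product([False, True], repeat=len(PAIRS)):
        yield {pair for pair, bit in zip(PAIRS, bits) if bit}

reflexive_euclidean = [R for R in all_relations() if reflexive(R) and euclidean(R)]
counterexamples = [R for R in reflexive_euclidean
                   if not (symmetric(R) and transitive(R))]

print(len(counterexamples))      # 0
print(len(reflexive_euclidean))  # 5, the number of equivalence relations on 3 elements
```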
The system B  This theorem of S5:

B  p ⊃ LMp


has a special interest. B is not a theorem of S4, and if it were added as an extra
axiom to S4, it would produce the system S5. If, however, B were added to T
instead of to S4, it would produce not S5 but a weaker system which neither
contains nor is contained in S4. This system is called the Brouwerian system, and B
the Brouwerian axiom. The system B is T + B, or K + T + B. B is valid on every
frame in which R is symmetric. Since T is valid on every reflexive frame, and
B = K + T + B, theorem 7.2 entails that B is sound with respect to the class of all
frames which are both reflexive and symmetric.

Adding B to S4 gives S5; since S4 is weaker than S5, it follows that B is not in
S4, and hence that S4 does not contain the system B. Nor does B contain S4 since
there are reflexive and symmetric frames that are not transitive. So B and S4 are
independent systems, in the sense that neither contains the other, though each lies
between T and S5.
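
The independence of B and S4 can be illustrated with two small frames. The Python sketch below is a hedged illustration: the evaluator, its names, and the two frames are assumptions chosen here. The first frame is reflexive and symmetric but not transitive, so it validates T and B but falsifies axiom 4; the second is reflexive and transitive but not symmetric, so it validates T and 4 but falsifies B.

```python
from itertools import product

def holds(f, w, R, val):
    """Truth of f at w; f is a variable name or ('imp', a, b), ('L', a), ('M', a)."""
    if isinstance(f, str):
        return w in val[f]
    op = f[0]
    if op == 'imp':
        return (not holds(f[1], w, R, val)) or holds(f[2], w, R, val)
    successors = [v for (u, v) in R if u == w]
    if op == 'L':
        return all(holds(f[1], v, R, val) for v in successors)
    if op == 'M':
        return any(holds(f[1], v, R, val) for v in successors)
    raise ValueError(f"unknown operator: {op!r}")

def valid_on_frame(f, W, R):
    """True iff f holds at every world under every valuation of 'p'."""
    worlds = sorted(W)
    for bits in product([False, True], repeat=len(worlds)):
        val = {'p': {w for w, b in zip(worlds, bits) if b}}
        if not all(holds(f, w, R, val) for w in worlds):
            return False
    return True

T = ('imp', ('L', 'p'), 'p')
FOUR = ('imp', ('L', 'p'), ('L', ('L', 'p')))
B = ('imp', 'p', ('L', ('M', 'p')))

# Reflexive and symmetric, but 0R1 and 1R2 without 0R2.
b_frame = ({0, 1, 2}, {(0, 0), (1, 1), (2, 2), (0, 1), (1, 0), (1, 2), (2, 1)})
# Reflexive and transitive, but 0R1 without 1R0.
s4_frame = ({0, 1}, {(0, 0), (1, 1), (0, 1)})

for name, frame in [('b_frame', b_frame), ('s4_frame', s4_frame)]:
    print(name, [valid_on_frame(f, *frame) for f in (T, FOUR, B)])
# b_frame [True, False, True]
# s4_frame [True, True, False]
```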


Other systems  There are infinitely many modal systems that may be obtained by adding
extra axioms to K. Already some others can easily be defined. For example, instead
of adding 4 to T to produce S4, one could add it merely to K or to D. The
resulting systems are often called K4 and KD4 respectively. If K4-frames are
defined as those which are transitive (whether or not they are reflexive), and
KD4-frames as those which are both serial and transitive, then the results proved so far
suffice to show that all the theorems of K4 are valid on all K4-frames and all the
theorems of KD4 are valid on all KD4-frames. Similarly, one could add B to K or
to D instead of T, to obtain the systems KB and KDB, which are sound with
respect to the classes of symmetrical frames and serial and symmetrical frames
respectively. K45, mentioned above, is sound with respect to the class of frames that are
transitive and Euclidean, and KD45 is sound with respect to the class of serial,
transitive and Euclidean frames.


7.1.3. Completeness


The canonical model for a system S is a special model with the property that a wff
α is valid in it iff ⊢_S α. This section shows that every (consistent) normal modal
system S has such a model. This fact makes it easy to establish completeness results, at
least for the systems being discussed, by the following strategy. Suppose there is a
class F of frames, and suppose one can establish that the frame of the canonical model
of S is in F. Then if α is F-valid, α will be valid on the frame of the canonical model for
S, and so a fortiori valid in the canonical model itself. But that entails that ⊢_S α.
So if α is F-valid then ⊢_S α, which is what the completeness of S with respect to
F means. Thus, for each specific system, it suffices to show that the frame of its
canonical model is indeed in F. For K, this is immediate since the relevant class F is
the class of all frames. For D, F is the class of serial frames and so it must be shown
that the frame of the canonical model for D is serial; for T it must be shown that it
is reflexive; for S4, B, S5 that it is reflexive and, respectively, transitive, symmetrical,
and Euclidean, and similarly for other systems.

Following procedures familiar from non-modal logic [see chapter 1, section 1.9],
a set of wff, Λ, is said to be S-consistent iff there is no finite collection α_1, ..., α_n ∈ Λ,
such that ⊢_S ~(α_1 ∧ ... ∧ α_n), where S is any normal modal logic. Γ is maximal iff
for every wff α either α ∈ Γ or ~α ∈ Γ. Γ is maximal consistent with respect to
S iff it is both maximal and S-consistent. Then lemma 7.3 holds simply because S
contains the classical propositional calculus (PC).





Lemma 7.3  If Λ is an S-consistent set of wff, then there is a maximal
S-consistent set of wff Γ such that Λ ⊆ Γ.


The next lemma is appropriate to modal logics. Where Λ is any set of wff of modal
logic, let L⁻(Λ) be the set consisting precisely of every wff β for which Lβ is in Λ;
i.e., L⁻(Λ) = {β : Lβ ∈ Λ}.





Lemma 7.4  If S is any normal propositional modal logic, and Λ is an
S-consistent set of wff containing ~Lα, then L⁻(Λ) ∪ {~α} is S-consistent.


Proof  Suppose L⁻(Λ) ∪ {~α} is not S-consistent. Then there are β_1, ..., β_n in
L⁻(Λ), such that ⊢_S ~(β_1 ∧ ... ∧ β_n ∧ ~α), and so, by PC, ⊢_S (β_1 ∧ ... ∧ β_n) ⊃ α. By
principles of K, ⊢_S (Lβ_1 ∧ ... ∧ Lβ_n) ⊃ Lα, and so ⊢_S ~(Lβ_1 ∧ ... ∧ Lβ_n ∧ ~Lα).
Thus the set {Lβ_1, ..., Lβ_n, ~Lα} is not S-consistent. Since each β_i ∈ L⁻(Λ), each
Lβ_i ∈ Λ; it is given that ~Lα ∈ Λ, so {Lβ_1, ..., Lβ_n, ~Lα} ⊆ Λ. Hence, under the
hypothesis, Λ is not S-consistent. Therefore, if Λ, containing ~Lα, is S-consistent,
L⁻(Λ) ∪ {~α} is S-consistent. QED

Now, define the canonical model for S as the triple (W, R, V) in which: W is the
set of all maximal S-consistent sets of wff; if w and w′ are both in W then wRw′
iff for every wff β, if Lβ ∈ w then β ∈ w′, or, using the L⁻ notation, wRw′ iff
L⁻(w) ⊆ w′; and V(p, w) = 1 iff p ∈ w.








Lemma 7.5  If (W, R, V) is the canonical model so defined for a normal
propositional modal system S, then for any wff α and any w ∈ W, V(α, w) = 1 iff
α ∈ w.


Proof  The result is defined to hold for the propositional variables; for complex wff,
it may be proved by induction on the construction of wff. The only complicated case
is the induction for L.


(a) Suppose that Lα ∈ w. By definition of R, α ∈ w′ for every w′ such that wRw′.
Since the lemma is assumed to hold for α, V(α, w′) = 1 for every w′ such that
wRw′. Hence by [VL], V(Lα, w) = 1.

(b) Suppose now that Lα ∉ w. Then by the maximality of sets w, ~Lα ∈ w. Hence,
by lemma 7.4, L⁻(w) ∪ {~α} is S-consistent. So by lemma 7.3 and the definition
of W, there is some w′ ∈ W such that L⁻(w) ∪ {~α} ⊆ w′, and therefore such
that

(i) L⁻(w) ⊆ w′ and

(ii) ~α ∈ w′.


By the definition of R, (i) gives wRw′, and since w′ is maximal S-consistent, (ii)
gives α ∉ w′. Hence, since this lemma is assumed to hold for α, V(α, w′) = 0. So
[VL] gives V(Lα, w) = 0. QED.


Theorem 7.6  Any wff α is valid in the canonical model of S iff ⊢_S α.


Proof  Let (W, R, V) be the canonical model of S. First suppose ⊢_S α. Then α is
in every maximal S-consistent set of wff (from the definitions). Hence, α is in
every w ∈ W, and so, by lemma 7.5, V(α, w) = 1 for every w ∈ W; i.e., α is valid in
(W, R, V). Suppose now that not-⊢_S α. Then {~α} is S-consistent and so, by lemma
7.3, there is some maximal S-consistent set, i.e., some w ∈ W, such that ~α ∈ w
and hence α ∉ w. So by lemma 7.5, V(α, w) = 0. Hence, if not-⊢_S α, then α is not
valid in (W, R, V). QED.

Theorem 7.6 yields the completeness of many modal logics according to the
strategy described above. For K, the result is immediate, since for K one takes F to
be the class of all frames, which, trivially, includes the frame of the canonical model.
For extensions of K, it is easy to prove:





Lemma 7.7  If S contains the formula T, then the frame of its canonical model
is reflexive; likewise if it contains D, the frame is serial; if 4, then it is transitive; if
B then it is symmetric; and if 5 (the axiom called E above) then it is Euclidean.


From this, the completeness of all the systems so far discussed follows directly: 


Theorem 7.8  The system K is complete with respect to the class of all frames,
and the systems K4, KB, K5, K45, T, D, KD4, KDB, KD45, S4, B, and S5 are
complete with respect to the class of all frames that are respectively: transitive;
symmetric; Euclidean; transitive and Euclidean; reflexive; serial; serial and
transitive; serial and symmetric; serial, transitive and Euclidean; reflexive and transitive;
reflexive and symmetric; reflexive, transitive and symmetric.





These results may be considerably generalized. Notice that each of the axioms
characteristic of the systems so far discussed is equivalent to one of the form


G^{k,l,m,n}  M^k L^l p ⊃ L^m M^n p


for k, l, m, n ≥ 0, where L^n α is α preceded by n Ls, and similarly for M^n α. It is
known that G^{k,l,m,n} is valid on a frame (W, R) iff the frame meets this condition:
(g^{k,l,m,n}) for all w, w′, w″ ∈ W, if wR^k w′ and wR^m w″, then there is a w‴ such that
w′R^l w‴ and w″R^n w‴, where R^n is the nth relative product of R, i.e., wR^0 w′ iff
w = w′, and if n > 0, wR^n w′ iff there is a w″ such that wRw″ and w″R^{n−1} w′.
Furthermore, if S contains G^{k,l,m,n} then the frame of its canonical model does meet (g^{k,l,m,n}).
Hence, theorem 7.9:





Theorem 7.9  Systems KG^{k,l,m,n} are sound and complete with respect to the class
of frames that satisfy conditions g^{k,l,m,n}.


Here KG^{k,l,m,n} is the result of adding any number of formulas G^{k,l,m,n} as axioms to K.


(Theorem 7.8 gives special cases of this. See Chagrov and Zakharyaschev (1997,
p. 79) and Chellas (1980, pp. 85-90, 182-4).)
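
The frame condition g^{k,l,m,n} and its special cases can be compared mechanically. The following Python sketch is illustrative and rests on assumptions made here: the names power and satisfies_g, the three-element set of worlds, and the table of index choices are not from the chapter. It checks, over every binary relation on three worlds, that the condition for the indices corresponding to D, T, 4, B and E coincides with seriality, reflexivity, transitivity, symmetry and the Euclidean property respectively.

```python
from itertools import product

W = range(3)
PAIRS = [(a, b) for a in W for b in W]

def power(R, n):
    """n-th relative product of R on W; R^0 is the identity relation."""
    result = {(w, w) for w in W}
    for _ in range(n):
        result = {(a, c) for (a, b) in result for (b2, c) in R if b == b2}
    return result

def satisfies_g(R, k, l, m, n):
    """If w R^k w1 and w R^m w2, then some w3 has w1 R^l w3 and w2 R^n w3."""
    Rk, Rl, Rm, Rn = power(R, k), power(R, l), power(R, m), power(R, n)
    return all(
        any((w1, w3) in Rl and (w2, w3) in Rn for w3 in W)
        for (w, w1) in Rk for (v, w2) in Rm if w == v
    )

def serial(R):
    return all(any((a, b) in R for b in W) for a in W)

def reflexive(R):
    return all((a, a) in R for a in W)

def transitive(R):
    return all((a, d) in R for (a, b) in R for (c, d) in R if b == c)

def symmetric(R):
    return all((b, a) in R for (a, b) in R)

def euclidean(R):
    return all((b, d) in R for (a, b) in R for (c, d) in R if a == c)

SPECIAL_CASES = {
    (0, 1, 0, 1): serial,      # D: Lp ⊃ Mp
    (0, 1, 0, 0): reflexive,   # T: Lp ⊃ p
    (0, 1, 2, 0): transitive,  # 4: Lp ⊃ LLp
    (0, 0, 1, 1): symmetric,   # B: p ⊃ LMp
    (1, 0, 1, 1): euclidean,   # E: Mp ⊃ LMp
}

mismatches = 0
for bits in product([False, True], repeat=len(PAIRS)):
    R = {pair for pair, bit in zip(PAIRS, bits) if bit}
    for klmn, familiar in SPECIAL_CASES.items():
        if satisfies_g(R, *klmn) != familiar(R):
            mismatches += 1

print(mismatches)  # 0
```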


7.1.4. The Finite Model Property


Canonical models provide a very convenient way to demonstrate the completeness
of modal logics, although, as shall be seen, this method does not work for all
systems. The frames of these models are very large, however, and for many systems
most of this size is wasted. A logic S has the finite model property iff it is
characterized by a class of finite models, or, equivalently, by a class of finite frames. Any
system with this property is complete, and so demonstrating this property is
sometimes an alternative way of establishing completeness when the original method fails.
Moreover, the finite model property is important because if a logic S (whether
modal or not) has the finite model property and it is finitely axiomatizable, then it is
decidable, i.e., there is an effective procedure that will determine in a finite number
of steps whether a given wff is a theorem of the system.

It can be shown that a system S has the finite model property by constructing,
as it were, mini-canonical models; for each non-theorem its falsifying model will be
constructed out of its own well-formed parts. Given a wff α, let Φ_α be the set of
sub-formulas of α. Let W be the set of all Φ_α-maximal S-consistent sets of wff,
where a set is Φ_α-maximal iff for every β ∈ Φ_α, either β or ~β is in the set. For
S = K, D, or T, let R be defined as for canonical models above, and for each w ∈ W,
let V(p, w) = 1 iff p ∈ w. It is straightforward then to prove the analogues of lemmas
7.3, 7.4, and 7.5 for the models (W, R, V) so defined. For these three systems,
lemma 7.7 remains true, and so there is another proof of their completeness, but
this time based on frames of finite size.

The preceding does not apply to systems S4, B, or S5, however, because relation
R so defined might not be transitive or symmetric or Euclidean. In regular canonical
models for S4, for example, transitivity is guaranteed by Lp ⊃ LLp, so that if Lβ ∈ w,
LLβ ∈ w, but now there is no assurance that LLβ ∈ Φ_α just because Lβ is. The
easiest way around this problem is to redefine R, so that now, for S4, wRw′ iff
Lβ ∈ w′ whenever Lβ ∈ w, for all wff β. This relation is transitive and reflexive. For
K4, define R so that wRw′ iff both Lβ ∈ w′ and β ∈ w′ whenever Lβ ∈ w. For B,
define R so that wRw′ iff β ∈ w′ if Lβ ∈ w and β ∈ w if Lβ ∈ w′. For S5, wRw′ iff
Lβ ∈ w′ if Lβ ∈ w and Lβ ∈ w if Lβ ∈ w′, and so on for the other systems discussed.
For each system, it is necessary to define a relation R that has the requisite
properties for frames for that system. This must be done system by system. Furthermore,
because R has been redefined from the regular canonical models, it is necessary to
return to the proof of lemma 7.5 to verify that it continues to hold under the new
definition. This may, however, be done for these systems, and for many others, with
the result that they are thereby shown to have the finite model property.


7.1.5. Beyond K, D, T, S4, B, and S5


The systems K, D, T, S4, B, and S5 and their characteristic axioms have been
introduced because they are historically and philosophically important, and because
they are easily shown to possess a number of desirable properties. It has been shown
that they are sound and complete with respect to their appropriate classes of frames,
and that they have the finite model property, and, because they are finitely
axiomatizable, they are decidable. They are also compact, in that every S-consistent
set of wff is satisfiable in a frame for S, for S being one of these systems, and they
are canonical in that the frames of their canonical models are frames for these
systems.

Not all normal modal logics are so well-behaved, however, and the methods 
described above do not always apply. For example, the system KW = K+ 


W  L(Lp ⊃ p) ⊃ Lp


is neither compact nor canonical. (Although, trivially, every theorem is valid in its
canonical model, not every theorem is valid on the frame of that model.) As a result,
the original proof of completeness will not work for this system. Nevertheless, using
the method of mini-canonical models, KW may still be proved sound and complete
with respect to the class of frames that are finite, irreflexive and transitive. For this,
define R so that wRw′ iff


(i) both Lβ ∈ w′ and β ∈ w′ whenever Lβ ∈ w, and
(ii) there is some Lβ ∈ w′ such that Lβ ∉ w.


This also shows that KW has the finite model property.
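
The role of irreflexivity and transitivity for axiom W can be seen on three tiny frames. The Python sketch below is an illustration only: the evaluator, its names, and the chosen frames are assumptions made here. It tests W on a finite strict order, on a single reflexive world, and on an irreflexive frame that is not transitive.

```python
from itertools import product

def holds(f, w, R, val):
    """Truth of f at w; f is a variable name or ('imp', a, b) or ('L', a)."""
    if isinstance(f, str):
        return w in val[f]
    op = f[0]
    if op == 'imp':
        return (not holds(f[1], w, R, val)) or holds(f[2], w, R, val)
    if op == 'L':
        return all(holds(f[1], v, R, val) for (u, v) in R if u == w)
    raise ValueError(f"unknown operator: {op!r}")

def valid_on_frame(f, W, R):
    """True iff f holds at every world under every valuation of 'p'."""
    worlds = sorted(W)
    for bits in product([False, True], repeat=len(worlds)):
        val = {'p': {w for w, b in zip(worlds, bits) if b}}
        if not all(holds(f, w, R, val) for w in worlds):
            return False
    return True

# W: L(Lp ⊃ p) ⊃ Lp
AXIOM_W = ('imp', ('L', ('imp', ('L', 'p'), 'p')), ('L', 'p'))

strict_order = ({0, 1, 2}, {(0, 1), (1, 2), (0, 2)})  # finite, irreflexive, transitive
reflexive_point = ({0}, {(0, 0)})                      # a single reflexive world
not_transitive = ({0, 1, 2}, {(0, 1), (1, 2)})         # irreflexive, 0R2 missing

print(valid_on_frame(AXIOM_W, *strict_order))     # True
print(valid_on_frame(AXIOM_W, *reflexive_point))  # False
print(valid_on_frame(AXIOM_W, *not_transitive))   # False
```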


Not all normal modal logics do possess the finite model property, however. One
that does not is Mk = T +


Mk  L(LLp ⊃ Lq) ⊃ (Lp ⊃ q)

Nevertheless, Mk is complete with respect to a fairly easily specifiable class of frames.

Completeness too is not universal. There exist countless systems that can be
proved to be incomplete in the sense that there is no class F of frames such that their
theorems are precisely the F-valid wff. One simple example is the system KH = K +

H  L(Lp ≡ p) ⊃ Lp

(Interestingly, the class of frames for KH characterizes KW.)

Much current research in modal logic is devoted to the systematic study of large
families of logics to examine the conditions under which they have, or lack,
properties like these.


7.2. First-Order Modal Logic 


First-order modal logic, or modal predicate logic, is an extension (grammatically,
axiomatically and semantically) of the propositional modal logic discussed above;
it is also an extension of non-modal first-order logic, or the lower predicate calculus,
LPC [see chapter 1]. The language of modal predicate logic simply adds the modal
operator L to the language of LPC, but the interpretation of this language
introduces some interesting questions and complications.

First, an obvious generalization of the interpretation of the non-modal language.
A model now consists of a quadruple (W, R, D, V) in which (W, R) is a frame,
as previously discussed, and D is a domain of individuals. To interpret the predicates
of the language, each n-place predicate is assigned a set of n-tuples from D in each
world. Intuitively, these are the (sequences of) individuals that satisfy that predicate
in that world. Alternatively, one may think of V as assigning to each n-place
predicate a set of n+1-tuples, in each one of which the first n terms are from D and the
final term is from W. To say that (u_1, ..., u_n, w) ∈ V(φ) is to say that φ is true
of u_1, ..., u_n (in that order) in world w.

More precisely, a model is a quadruple (W, R, D, V) in which W is a non-empty
set (worlds), R a binary relation on W, D another non-empty set (individuals) and
V a function such that, where φ is an n-place predicate, V(φ) is a set of n+1-tuples
each of the form (u_1, ..., u_n, w) for u_1, ..., u_n ∈ D and w ∈ W. In such a model an
assignment μ to the variables is a function such that, for each variable x, μ(x) ∈ D.
Where ρ is also an assignment to the variables, μ and ρ are x-alternatives iff for every
variable y except possibly x, ρ(y) = μ(y). Every wff has a truth-value at a world
relative to an assignment μ determined as follows:





[Vφ] V_μ(φx_1 ... x_n, w) = 1 if (μ(x_1), ..., μ(x_n), w) ∈ V(φ), and 0 otherwise.
[V~] V_μ(~α, w) = 1 if V_μ(α, w) = 0, and 0 otherwise.
[V∨] V_μ(α ∨ β, w) = 1 if V_μ(α, w) = 1 or V_μ(β, w) = 1, and 0 otherwise.


[V∀] V_μ(∀xα, w) = 1 if V_ρ(α, w) = 1 for every x-alternative ρ of μ, and 0
otherwise.


[VL] V_μ(Lα, w) = 1 if V_μ(α, w′) = 1 for every w′ such that wRw′, and 0
otherwise.





Our definitions of validity can be extended now to say that a wff α is valid in (W, R,
D, V) iff V_μ(α, w) = 1 for every w ∈ W and every assignment μ, and α is valid on a
frame F = (W, R) iff α is valid in every model based on F.

Given a system of normal propositional modal logic, S, system LPC + S is defined
as follows:


S′  If α is an LPC substitution instance of a theorem of S then α is an axiom
of LPC + S.


MP  If α and α ⊃ β are theorems of LPC + S then so is β.


∀1  If α is any wff and x and y any variables and α[y/x] is α with free y
replacing every free x, then ∀xα ⊃ α[y/x] is an axiom of LPC + S.


∀2  If α ⊃ β is a theorem of LPC + S and x is not free in α then α ⊃ ∀xβ is
a theorem of LPC + S.


N  If α is a theorem of LPC + S then so is Lα.

An additional principle of considerable interest is the Barcan formula

BF  ∀xLα ⊃ L∀xα


named after Ruth Barcan (now Ruth Barcan Marcus) who first introduced it (Barcan,
1946). Systems S + BF are systems LPC + S with the addition of BF. It turns out
that for some propositional systems S, e.g. B and S5, BF is already a theorem
schema of LPC + S, while for others, e.g. K, D, T or S4, it is not. This section only
considers systems which contain this formula, whether as a derived theorem or as a
postulated axiom.

Many theorems of modal LPC are obvious instances of theorems of propositional
modal logic, e.g. L(∀xφx ⊃ ∃yψy) ⊃ (L∀xφx ⊃ L∃yψy), which is an instance of
K, while others are instances of theorems of non-modal LPC, e.g. ∀xLφx ⊃ Lφy.
Modal predicate logic, however, also offers various mixed principles, like the Barcan
formula, BF, and its converse


CBF  L∀xα ⊃ ∀xLα

that exhibit interrelations among modal operators and quantifiers that cannot be
stated in modal propositional logic or in non-modal LPC alone. (CBF is provable in
LPC + K as it stands, and so does not need to be postulated separately.)
Another mixed principle, easily proved without the Barcan formula, is

(i) ∃xLα ⊃ L∃xα

Its converse is not provable, however, and, in fact, is not valid. Consider a case,

(ii) L∃xφx ⊃ ∃xLφx.


To see that (ii) is not valid under the intended interpretation, let φx be 'x is the
number of the planets'. Then the antecedent is true, for there must be some number
which is the number of the planets (even if only 0), but the consequent is false, for
there is no number which must be the number of the planets since it is contingent
how many planets there are.
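
The planets example can be rendered as a two-world model with a single domain. The Python sketch below is a hedged illustration: the evaluator value, the tuple encoding of formulas, the world names, and the particular numbers 8 and 9 are all assumptions made here. It evaluates (i) and (ii) using the truth conditions [Vφ], [V∀] and [VL] given above, with an existential clause added as the dual of [V∀].

```python
def value(f, w, mu, model):
    """Truth of f at world w under assignment mu in a model (W, R, D, V).

    Formulas: ('atom', pred, x1, ..., xn), ('not', a), ('imp', a, b),
    ('all', x, a), ('ex', x, a), ('L', a).
    """
    W, R, D, V = model
    op = f[0]
    if op == 'atom':
        _, pred, *variables = f
        return tuple(mu[x] for x in variables) + (w,) in V[pred]
    if op == 'not':
        return not value(f[1], w, mu, model)
    if op == 'imp':
        return (not value(f[1], w, mu, model)) or value(f[2], w, mu, model)
    if op == 'all':   # every x-alternative of mu
        _, x, body = f
        return all(value(body, w, {**mu, x: u}, model) for u in D)
    if op == 'ex':    # some x-alternative of mu
        _, x, body = f
        return any(value(body, w, {**mu, x: u}, model) for u in D)
    if op == 'L':
        return all(value(f[1], v, mu, model) for v in W if (w, v) in R)
    raise ValueError(f"unknown operator: {op!r}")

phi_x = ('atom', 'phi', 'x')  # 'x is the number of the planets'
formula_i = ('imp', ('ex', 'x', ('L', phi_x)), ('L', ('ex', 'x', phi_x)))
formula_ii = ('imp', ('L', ('ex', 'x', phi_x)), ('ex', 'x', ('L', phi_x)))

W = {'w1', 'w2'}
R = {(a, b) for a in W for b in W}       # every world sees every world
D = {8, 9}
V = {'phi': {(8, 'w1'), (9, 'w2')}}      # 8 planets in w1, 9 planets in w2
model = (W, R, D, V)

print(value(formula_i, 'w1', {}, model))   # True
print(value(formula_ii, 'w1', {}, model))  # False
```

In each world some number is the number of the planets, so L∃xφx holds, but no single number plays that role in both worlds, so ∃xLφx fails.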

Each of the following systems is sound and complete with respect to the class of
frames listed beside it:

K + BF:   all frames

D + BF:   serial frames

T + BF:   reflexive frames

K4 + BF:  transitive frames

KB + BF:  symmetrical frames

S4 + BF:  reflexive transitive frames

B + BF:   reflexive symmetrical frames

S5 + BF:  equivalence frames.





(These results may be proved by adapting the method of maximal consistent sets
previously described. In general, the completeness of S + BF follows when the frame
of its canonical model is a frame for S.)


7.2.1. De re and de dicto 


As seen in the components of the mixed principles that are proper to modal
predicate logic, and not merely generalizations of propositional modal logic or of
non-modal predicate logic, some formulas in modal predicate logic have a variable x free
inside the scope of the modal operator L and some formulas have no free variables
inside the scope of L. The former are called de re, and the latter de dicto. To see the
nature of this distinction, consider the parts of (ii) above. Its consequent, ∃xLφx,
says that there is a thing (in Latin, a res) and concerning this thing (de re), it, i.e.,
that very same thing, is φ in every accessible world. By contrast, (ii)'s antecedent,
L∃xφx, does not carry this implication. It says, concerning the proposition (dictum)
that something is φ, that this proposition is a necessary truth, i.e., that in every
accessible world something (but not necessarily the same thing in each world) is φ.
(Don't worry whether the Latin descriptions are really accurate; what matters is the
distinction itself.)

De re formulas, containing quantification into the scope of modal operators, are
often thought to be philosophically more problematic than de dicto expressions
(Quine, 1933). Hence, it would be convenient if they could be eliminated, or
rather, shown always to be equivalent to de dicto wff. This, however, is not the case.
In particular, ∃xLφx is not equivalent to any de dicto wff even in S5 + BF, and so not
in any weaker system (Tichy, 1973).


7.2.2. Validity without the Barcan formula


The Barcan formula is valid on the definition of validity given above for modal
predicate logics. As noted, this formula is a theorem of some systems LPC + S, e.g.,
those containing system B, but not of others. How, then, shall one give an account
of validity for those systems lacking the Barcan formula?

This question becomes all the more important as a number of philosophical
objections have been raised against this formula. The basic Barcan formula is the
wff


∀xLφx ⊃ L∀xφx


Under the standard interpretation this means that if everything necessarily possesses
a certain property φ, then it is necessarily the case that everything possesses that
property. But now, it is sometimes argued, even if everything that actually exists is
necessarily φ, this does not preclude the possibility that there might have existed
some other things which were not φ, and in that case it would not be a necessary
truth that everything is φ.

This objection depends on the assumption that in various possible worlds, not
merely might objects have different properties from those they have in the actual
world, but there might even be objects which do not exist in the actual world at all.
It looks as though the semantics given above for modal predicate logic implicitly
denies this assumption since each model has only a single domain of individuals, the
same for each world. That is what yields the validity of the Barcan formula. This
suggests that one might obtain a semantics which does not validate this formula by
admitting models in which different domains are associated with different worlds.

Accordingly, for systems LPC + S, which may lack the Barcan formula, define a
model as a quintuple (W, R, D, Q, V) in which W, R, and D are as before, and Q
is a function from members of W to subsets of D. Intuitively, Q(w), often written
D_w, is the set of individuals which exist in w. For now, suppose that these models
satisfy the inclusion requirement, that if wRw′ then D_w ⊆ D_w′. To evaluate atomic
formulae φx in a world w, it is not a requirement that the value assigned to free x be
a member of D_w, but rather each model may determine whether such formulas are
true or false in w (Kripke, 1963b). That is, leave intact the specification that V(φ) is
a set of n+1-tuples each of the form (u_1, ..., u_n, w) for u_1, ..., u_n ∈ D and w ∈ W
without constraints that each u_i ∈ D_w. Then [Vφ], [V~], [V∨], and [VL] remain as
before, but [V∀] now becomes





[V∀] V_μ(∀xα, w) = 1 if V_ρ(α, w) = 1 for every x-alternative ρ of μ such that
ρ(x) ∈ D_w, and 0 otherwise.


This has the effect of making the quantifiers range over just the individuals that exist
in the world w. Furthermore, importantly, the definition of validity is modified to
say that a wff α is valid in a model (W, R, D, Q, V) iff V_μ(α, w) = 1 for every w ∈ W
and every assignment μ such that μ(x) ∈ D_w for every variable x.

Under this specification every theorem of LPC + S is valid, but the Barcan
formula, BF, is not always. (Given the inclusion requirement, if relation R is symmetric,
then if wRw′ then D_w = D_w′, and then BF will be valid. This reflects the provability of
BF in the systems LPC + B and LPC + S5.) The systems LPC + S, where S = K, D,
T, or S4 (and a number of others) lacking the Barcan formula may also be proved
complete using the methods described earlier.
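
A model with growing domains shows concretely how BF can fail while the inclusion requirement holds. The Python sketch below is illustrative and rests on assumptions made here: the evaluator, the encoding of formulas, and the individuals 'a' and 'b' are hypothetical choices. Quantifiers range over Q[w], the individuals existing in the world of evaluation, as in the revised clause [V∀].

```python
def value(f, w, mu, model):
    """Truth of f at w under mu in a varying-domain model (W, R, D, Q, V).

    Formulas: ('atom', pred, x1, ..., xn), ('imp', a, b), ('all', x, a), ('L', a).
    """
    W, R, D, Q, V = model
    op = f[0]
    if op == 'atom':
        _, pred, *variables = f
        return tuple(mu[x] for x in variables) + (w,) in V[pred]
    if op == 'imp':
        return (not value(f[1], w, mu, model)) or value(f[2], w, mu, model)
    if op == 'all':   # only individuals that exist in w are considered
        _, x, body = f
        return all(value(body, w, {**mu, x: u}, model) for u in Q[w])
    if op == 'L':
        return all(value(f[1], v, mu, model) for v in W if (w, v) in R)
    raise ValueError(f"unknown operator: {op!r}")

phi_x = ('atom', 'phi', 'x')
BF = ('imp', ('all', 'x', ('L', phi_x)), ('L', ('all', 'x', phi_x)))
CBF = ('imp', ('L', ('all', 'x', phi_x)), ('all', 'x', ('L', phi_x)))

W = {0, 1}
R = {(0, 0), (0, 1), (1, 1)}        # reflexive; world 0 sees world 1
D = {'a', 'b'}
Q = {0: {'a'}, 1: {'a', 'b'}}       # 'b' exists only in world 1
V = {'phi': {('a', 0), ('a', 1)}}   # 'a' is phi everywhere; 'b' is never phi
model = (W, R, D, Q, V)

inclusion = all(Q[w] <= Q[v] for (w, v) in R)
print(inclusion)                     # True
print(value(BF, 0, {}, model))       # False
print(value(CBF, 0, {}, model))      # True
```

At world 0 the only existing individual is necessarily φ, yet world 1 contains a further individual that is not φ, so L∀xφx fails there.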

Models for systems that do contain the Barcan formula are obviously a special case
of models for systems without it since a model will satisfy BF if Q(w) = D for every
w ∈ W. So if one wants the quantifiers in each world to range only over the things
that exist in that world, and one doesn't believe that the same things exist in every
world, one would probably not want the Barcan formula. But one would probably
not want its converse, CBF, either. For consider


L∀xφx ⊃ ∀xLφx    (7.1)


which is an instance of CBF. It could happen that in every world everything which
exists in that world is φ, but that something in our world fails to be φ in some other
world. That other world will, of course, be a world in which the object in question
does not exist.

CBF is, however, valid on the account given so far; it is valid because of the
inclusion requirement. But since the reason for rejecting CBF requires violating that
condition, one might consider abandoning it so as to avoid both the Barcan formula
and its converse. This will raise problems of its own, however (as one should expect,
since CBF is provable in LPC + S without extra assumptions). In particular, the rule
of necessitation will no longer preserve validity. Although


∀xφx ⊃ φy    (7.2)

is valid,

L(∀xφx ⊃ φy)    (7.3)

is not.

One way to preserve necessitation is to return to the original definition of validity
in a model, placing no restrictions on μ. This, however, makes (7.2) itself no longer
valid, for if everything in every world is φ except for some u ∉ D_w, then V_μ(∀xφx ⊃ φy,
w) = 0 if μ(y) = u. The problem with (7.2) is that y might be assigned something
which does not exist (in the world), while the quantifiers are restricted to things that
do. The restriction on μ was designed to exclude such cases. This suggests that one
might make the restriction explicit by adding a predicate, E, for existence (in a
world). This has the semantics:


[VE] (u, w) ∈ V(E) iff u ∈ D_w


Then although (7.2) fails to be valid, its counterpart

(∀xφx ∧ Ey) ⊃ φy    (7.2′)


is valid. And more generally, although the postulate ∀1 (given on page 148) will not
be valid, it may be replaced by

∀1E  (∀xα ∧ Ey) ⊃ α[y/x]

(∀1E is the standard replacement for ∀1 in what is called free logic [see chapter
12].)


7.2.3. Axiomatization of systems with an existence predicate


Replacing ∀1 with ∀1E requires some other changes to the LPC basis of these
systems. In particular, one cannot easily use ∀2 as it stands, and so it will be helpful
to state the basis for these systems explicitly. Where S is any normal propositional
modal logic, LPCE + S is defined as follows:


S′  Any LPC substitution-instance of a theorem of S is an axiom of
LPCE + S.

∀1E  Where x and y are any individual variables, and α is any wff then
(∀xα ∧ Ey) ⊃ α[y/x] is an axiom of LPCE + S.

∀⊃  ∀x(α ⊃ β) ⊃ (∀xα ⊃ ∀xβ) is an axiom of LPCE + S, where α and β
are any wff and x is any variable.

VQ  α ≡ ∀xα is an axiom of LPCE + S, provided x is not free in α.

UE  ∀xEx is an axiom of LPCE + S.

UG  If α is a theorem of LPCE + S, then ∀xα is a theorem of LPCE + S.


UGL∀^n  If α_1 ⊃ L(α_2 ⊃ ... ⊃ L(α_n ⊃ Lβ)...) is a theorem of LPCE + S, and
x is not free in α_1, ..., α_n, then α_1 ⊃ L(α_2 ⊃ ... ⊃ L(α_n ⊃ L∀xβ)...)
is a theorem of LPCE + S.


Such systems LPCE + S, lacking both the Barcan formula and its converse, may then
be shown to be sound and complete by the methods previously described.


7.2.4. Kripke-style systems


Kripke (1963b) advocated a different way of axiomatizing systems containing
neither BF nor CBF. Instead of introducing an existence predicate, he restricted the
theorems of the systems to their universal closures, since even though (7.2) is not
valid, its closure, ∀y(∀xφx ⊃ φy), is. This version of LPC is called 'LPCK,' for
'Kripke-style,' although the axiomatization here is not exactly the same as Kripke's.
Where S is any normal system of propositional modal logic, LPCK + S is just like
LPCE + S but without UE and without UGL∀^n, and with ∀1E replaced by

∀1K  Where x, y and z are any individual variables, and α is any wff then
∀y∀z(∀xα ⊃ α[y/x]) is an axiom of LPCK + S.


(The presence of ∀z in ∀1K is needed to prove ∀x∀yα ⊃ ∀y∀xα.) The semantics is
as for LPCE + S except that V(E) is no longer required. Completeness may then be
proved for many of the Kripke-style systems, though there are some limitations since
it is not obvious how to deal with systems containing B.


7.2.5. Identity in modal predicate logic


Now, suppose one adds a predicate φ_=, or more familiarly '=', for identity to the
language for modal predicate logic. In keeping with its intended interpretation,
stipulate that in all models V(φ_=) is the set of triples (u, u, w) for u ∈ D and w ∈ W.
With this interpretation an axiomatic basis for systems S + I may be provided by
adding to LPC + S the axiom schemes





No xex 
2 x=yD(a>8) 


where a and 8 differ only in that a has free x in 0 or more places where B has 
fice y 
‘These postulates yield the theorem 


Ll xeyDLe=y 


by taking a as Le=.cand B as Lx= y. Further, some systems, all extensions of 
B, also contain the analogous principle for non identities. 


INI x#yDLeey 


Intuitively LI and LNI seem to stand or fll together, and so if a modal system 
contains LI it should also contain LNI. Both, afterall, are valid under the inter- 
pretation given above for 6.. Systems § + LNI add LNI to $+, which, of course, 
already contains LI. (As noted, LNI may already be a theorem in some S +1 sys 
tems.) These systems may be proved sound and complete. More generally, when 
is consistent, $+LNI is sound, and completeness holds for all systems $+ LNI 
whose canonical model is based on a frame for S. These results also apply to systems 
without BF and to systems with an existence predicate. 

LI is a matter of some controversy, however. It seems to say that all identity
statements are necessary, i.e., whenever x and y are the same object, then it is a
necessary truth that they are. Now it seems easy to think of counterexamples to this.
E.g., the sentence:


The person who lives next door is the mayor.        (7.4)

seems to assert an identity between the person who lives next door and the mayor;
if so, one could rewrite (7.4), semi-formally, as:


The person who lives next door = the mayor.


Yet surely, one would think, this is contingent, for it is logically possible that the
person who lives next door is not the mayor.⁹ Notice, though, that (7.4) is not
stated using variables; rather, it uses the definite descriptions ‘the person next door’
and ‘the mayor’. This suggests applying Russell’s (1905) theory of descriptions to
cases like (7.4). Let φ be the predicate ‘lives next door’ and let ψ be ‘is the mayor.’
On Russell’s account, (7.4) becomes


∃!xφx ∧ ∃!xψx ∧ ∀x∀y((φx ∧ ψy) ⊃ x = y)        (7.5)


(∃!xφx means ‘exactly one x is φ’, which may be defined as ∃y∀x(φx ≡ x = y).) Now
(7.5) is true but not necessarily true, so putting L in front of the whole conjunction
gives a false sentence. But LI does not license that move from (7.5); it only allows
the move to


∃!xφx ∧ ∃!xψx ∧ ∀x∀y((φx ∧ ψy) ⊃ Lx = y)        (7.6)


and this results in no problems of interpretation.


7.2.6. Contingent identity 


Russell’s theory gives one way in which the classical view of identity in modal
predicate logic, with both LI and LNI, can accommodate such apparent
counterexamples as (7.4). Nevertheless, LI and LNI might still be thought unintuitive,
and so one should see whether one can adapt the semantics to avoid having them as
valid. Consider the status of LI with respect to an assignment μ to individual
variables. One can falsify LI only if μ is allowed to give variables different values in
different worlds. In (7.5), for example, letting x stand for ‘the person next door’ and
y stand for ‘the mayor,’ would mean requiring μ to assign to x in a world w whoever
it is who in w lives next door and assign to y whoever it is who in w is the mayor.
Thinking in this way, then, of course, μ(x) and μ(y) may coincide in w but not in w′.
More generally, now suppose that an assignment μ determines a value in D for x at
w, μ(x, w) ∈ D, and modify [Vφ] to


[Vφ′]   V_μ(φx₁...xₙ, w) = 1 if ⟨μ(x₁, w), ..., μ(xₙ, w), w⟩ ∈ V(φ), and 0
        otherwise.


LI is not valid on such a semantics, as desired, and hence neither is I2, although all
instances of I2 in which α and β contain no modal operators remain valid.
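
The failure of LI under world-relative assignments can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration only: the two worlds, the accessibility relation, and the individuals "alice" and "bob" are hypothetical, and the code models just the identity predicate, not the full semantics.

```python
# Hypothetical two-world model; names and values are illustrative only.
W = ["w", "w_prime"]
R = {("w", "w"), ("w", "w_prime"), ("w_prime", "w_prime")}

# mu(variable, world): an assignment whose values may differ between worlds.
mu = {
    ("x", "w"): "alice", ("x", "w_prime"): "alice",
    ("y", "w"): "alice", ("y", "w_prime"): "bob",
}

def identical(a, b, world):
    """x = y is true at a world iff mu gives both variables the same value there."""
    return mu[(a, world)] == mu[(b, world)]

def necessarily_identical(a, b, world):
    """L(x = y): x = y holds at every world accessible from `world`."""
    return all(identical(a, b, v) for v in W if (world, v) in R)

antecedent = identical("x", "y", "w")              # x = y at w
consequent = necessarily_identical("x", "y", "w")  # L(x = y) at w
LI_holds_at_w = (not antecedent) or consequent
```

In this model the two variables coincide at w but come apart at the accessible world, so the antecedent of LI is true at w while its consequent is false.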

Systems with I2 weakened in this way are called contingent identity systems, or
systems S + CI; in them I2 is replaced by

I2′     x = y ⊃ (φx₁...xₙ ⊃ φy₁...yₙ)


where each xᵢ, for 1 ≤ i ≤ n, either is the very same variable as yᵢ, or else xᵢ is x and
yᵢ is y. Nevertheless, such systems do not correspond to the semantics just described,
for this semantics validates the schema:


L∃xφx ⊃ ∃xLφx        (7.7)


which is not derivable in any of the contingent identity systems (since these are
weaker than the LNI systems and (7.7) is not derivable in any of those).

This formula (7.7) has been encountered before, and it was noted then that it
does not seem valid. To adapt an example given by Quine (1953, p. 148), in certain
games, it is necessary that some player will win, but there is no individual player who
is bound to win. One might, however, make (7.7) sound plausible by taking an
expression such as ‘the winner’ as, in a sense, standing for a single object, though it
would be one which, in a more usual sense of ‘object,’ would be one object in one
situation but a different one in another. In that case, if it is necessary that someone
will win, then there is someone who is bound to win, namely, the winner. Such
so-called objects are often called intensional objects or individual concepts, and the
interpretation now being considered would thus seem to provide a semantics for a
logic in which the individual-variables range over intensional objects.

On this view of object as intensional object, the semantics seems to allow one to
make an object out of any string of members of D whatever. For example, suppose
there are two worlds, w₁ and w₂; then where u₁ and u₂ are members of D, it seems
one is entitled to make up the object which is u₁ in w₁ and u₂ in w₂. Viewed in this
way, the LNI systems might be considered as requiring that the only strings of
members of D which count as objects are strings consisting of the same member of
D in each world (i.e., the only objects recognized in these systems are the
straightforward members of D themselves). This suggests that an adequate semantics for
contingent identity systems, S + CI, which lack schema (7.7), would neither require
that only strings consisting of a single member of D count as objects, nor allow that
any string whatsoever of members of D should count. This may be done by having
each model stipulate a set of ‘allowable’ intensional objects (Parks, 1974). This will
yield soundness and completeness for systems of contingent identity.

If the quantifiers range over all intensional objects, then, as noted, (7.7)
L∃xφx ⊃ ∃xLφx becomes valid. This raises the question of what axiomatic systems
are correct for this full logic of intensional objects based on an underlying propositional
system S. The answer is that for most choices of S, including all the systems
discussed here, except for S5, the logic of intensional objects based on S is
unaxiomatizable (Garson, 1984, section 3; Thomason, 1970b).





Suggested further reading


See Hughes and Cresswell (1996) for a comprehensive introduction to the topics discussed
here, and other topics in modal logic, presented in much the same style as this chapter. For
propositional modal logic, Bull and Segerberg (1984) is a rich source, while Chagrov and
Zakharyaschev (1997) presents a very thorough, but more advanced, mathematical treatment.
All three of these have extensive bibliographies of primary work in the field. Chellas (1980)
presents another convenient introduction to propositional modal logic. For the properties of
first-order modal logics, Garson (1984) presents a helpful overview. And for a discussion of
some of the philosophical issues concerning quantification and modality, see Cocchiarella





Notes 


1 This kind of account of validity appeared in the late 1950s and early 1960s. It is
associated especially with the work of Kripke, e.g., (1959), (1963), and so our frames are
often called ‘Kripke frames’, but similar ideas appear in the work of others around that
time, e.g., Kanger (1957), Bayart (1958), Montague (1960) and Hintikka (1961).
Earlier, related accounts may be seen in Wajsberg (1933), McKinsey (1945), Carnap (1946),
among others. Jónsson and Tarski (1951) presents an algebraic description of this notion
of validity, though the connection with modal logic was not made in that article.

2 Another related method commonly used to demonstrate the finite model property is the
method of filtrations; see Hughes and Cresswell (1984, pp. 136-45), Bull and Segerberg
(1984, pp. 438), or Chagrov and Zakharyaschev (1997, pp. 140).

3 This system is called W by Segerberg (1971, p. 84), though it is also widely called G after
Gödel since it has been studied as the modal logic of ‘provability’, e.g., by Boolos (1979).
For a more recent survey of the history of provability logic, see Boolos and Sambin
(1990). The system dates at least from Löb (1966).

4 For demonstration of these and other results, see, for example, Hughes and Cresswell
(1996, pt. 2), and the references cited there. See also Bull and Segerberg (1988) and
Chagrov and Zakharyaschev (1997).

5 It is noteworthy that completeness in modal LPC is sometimes more difficult to achieve
than in modal propositional logic. For instance, though S4.2 (= S4 + MLp ⊃ LMp) is
characterized by frames which are reflexive, transitive and convergent, and all its frames
have these properties, its first-order counterpart, S4.2 + BF, is not characterized by any
class of frames. (The incompleteness of S4.2 + BF is stated, without proof, in Shehtman
and Skvortsov (1991); Cresswell (1995) presents a proof based on the fact that





Meg Va(Ge> Lye) 4 L~Vewe) « MV a(Oey Lye) 4 Vai Me L(BeOe3 $8) 


is not satisfiable on any convergent frame but is consistent in S4.2.)

6 See Prior (1957, pp. 26-8 passim), also Hintikka (1961), and Myhill (1958, p. 80). For
a defence of the formula, see Barcan (1962, pp. 88-90) and Cresswell (1991).

7 See Garson (1984, p. 261). An existence predicate is introduced by Rescher (1959), and
also assumed by Fine (1978).

8 See Thomason (1970a) though, who uses an existence predicate defined in terms of
identity (p. 57). The rules UGL∀ⁿ first appear in that paper.

9 Examples like this date at least from Frege (1892). Quine (1947), however, first called
attention to the problems they raise for modal predicate logic, and the literature since
then has flourished. We shall not enter that philosophical discussion here, but only
describe briefly some different options one might take for interpreting modal systems.
LI is derived as a theorem in Barcan (1947).
Chapter 8
Deontic Logic 
Risto Hilpinen 


8.1. Introduction 


Deontic logic is an area of logic which investigates normative concepts, systems of
norms, and normative reasoning. The word ‘deontic’ is derived from the Greek
expression ‘déon’, which means ‘what is binding’ or ‘proper’. Thus, Jeremy Bentham
(1983) used the word ‘deontology’ for “the science of morality,” and the Austrian
philosopher Ernst Mally (1926), who developed in the 1920s a system of the
“fundamental principles of the logic of ought,” called his theory ‘Deontik’. Normative
concepts include the concepts of obligation (ought), permission (may), prohibition
(may not), and related notions, such as the concept of right. Systems of deontic logic
contain, in addition to the usual sentential connectives and quantifiers, logical
constants which represent some of these normative concepts.

Much of the recent work on deontic logic has been based on the view that
deontic logic is a branch of modal logic [see chapter 7], and that the concepts of
obligation, permission, and prohibition are related to each other in the same way as
the alethic modalities necessity, possibility and impossibility. This view goes back to
medieval philosophy; some fourteenth-century philosophers observed the analogies
between deontic and alethic modalities, and studied the deontic (normative)
interpretations of various laws of modal logic. In the same way, Leibniz (1930) called the
deontic categories of the obligatory, the permitted and the prohibited ‘legal modalities’
(Iuris modalia), and observed that the basic principles of modal logic hold for the
legal modalities. In fact, Leibniz suggested that deontic modalities can be defined in
terms of the alethic modalities; according to him, the permitted (licitum) is


what is possible for a good man to do,

and the obligatory (debitum) is


what is necessary for a good man to do.

The contemporary development of deontic logic since the publication of von Wright’s
(1957 [1951]) pioneering paper “Deontic Logic” has been based on the study of
the analogies between normative and alethic modalities.


8.2. The Standard System of Deontic Logic (SDL) 


A simple system of deontic logic can be obtained by reading Leibniz’s definition of
the concept of obligation (ought) as


(O.Leibniz₁)    p is obligatory for a iff (if and only if) p is necessary for a’s
                being a good person


that is,

(O.Leibniz₂)    Oₐp iff N(G(a) ⊃ p)


where ‘N’ is the alethic necessity operator and ‘G(a)’ means that a is ‘good’ (in the
sense intended by Leibniz). Deleting the explicit reference to an agent gives the
following definition of the concept of ought:


(O.Leibniz₃)    Op ≡ N(G ⊃ p)


The corresponding Leibnizian concept of permission (or the concept of may) is
expressed by


(P.Leibniz₃)    Pp ≡ M(G & p)


(where ‘M’ is the operator for alethic possibility). These schemata can be regarded
as partial reductions of deontic logic to ‘ordinary’ (alethic) modal logic. The Leibnizian
analysis of the concepts of obligation and permission was rediscovered by the
Swedish philosopher Kanger in 1950, who interpreted the constant G as ‘what morality
prescribes’ (Kanger, 1981 [1957]). According to this interpretation, Op (it ought to
be the case that p) means that p follows from the requirements of morality. Anderson
(1967 [1956]) put forward a reduction schema equivalent to Kanger’s,


(O.S)   Op ≡ N(¬p ⊃ S)


where S may be taken to mean the threat of a sanction or simply the proposition
that the requirements of law or morality have been violated.

If the alethic N-operator satisfies the axioms of the modal logic T (Chellas, 1980,
p. 131) [or see chapter 7], viz.


(K)     N(p ⊃ q) ⊃ (Np ⊃ Nq)
(T)     Np ⊃ p

and the modal ‘rule of necessitation’

(RN)    If p is provable, Np is provable, or briefly, p/Np


it is easy to see that the ought-operator defined by (O.Leibniz₃) satisfies the deontic
K-principle


(K_D)   O(p ⊃ q) ⊃ (Op ⊃ Oq)


and the rule of ‘deontic necessitation’


(RN_D)  p/Op

The additional assumption that being good is possible,

(De)    MG


yields the principle of deontic consistency


(D_D)   Op ⊃ Pp

where ‘P’ represents the concept of permission, definable in terms of ‘O’ by

(P)     Pp ≡ ¬O¬p


Similarly, the concept of prohibition, F, is defined by

(F)     Fp ≡ O¬p


where a state of affairs p is prohibited iff not-p is obligatory. The system of
(propositional) deontic logic obtained by adding to propositional logic the axioms
(or axiom schemata) K_D and D_D and the rule RN_D is usually called the ‘standard
system of deontic logic’ (SDL). Among its theorems are:


O(p & q) ⊃ (Op & Oq)    (Conjunctive distributivity of O)       (8.1)
(Op & Oq) ⊃ O(p & q)    (Aggregation principle for O)           (8.2)
Op ⊃ O(p ∨ q)                                                   (8.3)
O(p ⊃ q) ⊃ (Pp ⊃ Pq)                                            (8.4)
Pp ⊃ P(p ∨ q)                                                   (8.5)
P(p ∨ q) ⊃ (Pp ∨ Pq)    (Disjunctive distributivity of P)       (8.6)
P(p & q) ⊃ Pp                                                   (8.7)


while the rules of inference

(RM_D)  p ⊃ q / Op ⊃ Oq
(RE_D)  p ≡ q / Op ≡ Oq


are derivable. On the basis of the axioms K_D and D_D, this system may be called the
system KD, or simply D; it is a member of the family of normal modal logics, all
of which contain (a counterpart of) the rule RN_D [see chapter 7] (Chellas, 1980,
p. 114).
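
The claim that the Leibnizian ought-operator inherits the deontic K-principle and, given MG, deontic consistency can be checked by brute force on small models. The following is only a sketch under stated assumptions: it enumerates reflexive frames on two worlds (so that T holds for N), treats G as an ordinary proposition, and all function names are hypothetical. Passing on these frames illustrates the claim; it is not a proof of it.

```python
from itertools import product

W = (0, 1)
PAIRS = [(u, v) for u in W for v in W]
VALUATIONS = list(product([False, True], repeat=len(W)))

def reflexive_relations():
    """Yield every reflexive accessibility relation on W."""
    for bits in product([False, True], repeat=len(PAIRS)):
        R = {pair for pair, b in zip(PAIRS, bits) if b}
        if all((u, u) in R for u in W):
            yield R

def N(R, X):  # alethic necessity: true at u iff X holds at all alternatives
    return tuple(all(X[v] for v in W if (u, v) in R) for u in W)

def M(R, X):  # alethic possibility: true at u iff X holds at some alternative
    return tuple(any(X[v] for v in W if (u, v) in R) for u in W)

def imp(X, Y):
    return tuple((not a) or b for a, b in zip(X, Y))

def conj(X, Y):
    return tuple(a and b for a, b in zip(X, Y))

def O(R, G, X):  # Leibnizian ought: Op is N(G implies p)
    return N(R, imp(G, X))

def P(R, G, X):  # Leibnizian may: Pp is M(G and p)
    return M(R, conj(G, X))

# Deontic K-principle: O(p implies q) implies (Op implies Oq)
k_d_valid = all(
    all(imp(O(R, G, imp(p, q)), imp(O(R, G, p), O(R, G, q))))
    for R in reflexive_relations()
    for G in VALUATIONS for p in VALUATIONS for q in VALUATIONS)

# Deontic consistency, checked only at worlds where MG is true
d_d_given_MG = all(
    (not M(R, G)[u]) or imp(O(R, G, p), P(R, G, p))[u]
    for R in reflexive_relations()
    for G in VALUATIONS for p in VALUATIONS for u in W)

# Without the assumption MG, deontic consistency can fail
d_d_unconditional = all(
    all(imp(O(R, G, p), P(R, G, p)))
    for R in reflexive_relations()
    for G in VALUATIONS for p in VALUATIONS)
```

The last check fails when G is false everywhere: every obligation then holds vacuously while nothing is permitted, which is why the possibility of being good is needed as an extra assumption.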


8.3. The Semantics of the Standard Deontic Logic 


The sentences of SDL can be interpreted in terms of possible worlds (or world
states) in the same way as other normal modalities. A possible worlds interpretation
of SDL is a triple M = ⟨W, I, R⟩, where W is a universe of possible worlds, I is an
interpretation function which assigns to each sentence a subset of W, i.e., the worlds
u ∈ W where the sentence is true; the truth of p at u under M is expressed ‘M,
u ⊨ p’, or briefly ‘u ⊨ p’. If p is not true at u, it is false at u. R is a 2-place relation on
W, called the relation of deontic alternativeness. The interpretation function assigns
each sentence a truth value at each possible world. A sentence is called valid
(logically true) iff it is true at every world u ∈ W for any interpretation M, and q is a
logical consequence of p iff there is no interpretation M and world u such that M,
u ⊨ p and not M, u ⊨ q. The interpretation function is subject to the usual Boolean
conditions which ensure that the truth-functional compounds of simple sentences
receive appropriate truth-values at each possible world. The alternativeness relation
R is needed for the interpretation of sentences involving the deontic operators. In
the semantics of modal logic, necessary truth at a given world u is understood as
truth at all worlds which are possible relative to u or alternatives to u, and possibility
at u means truth at some alternative to u. For the concepts of obligation (or ought)
and permission (may), these conditions can be formulated as follows:


(CO)    u ⊨ Op iff v ⊨ p for every v ∈ W such that R(u, v)
(CP)    u ⊨ Pp iff v ⊨ p for some v ∈ W such that R(u, v)


For the axiom D_D to be valid, it is necessary to regard R as a serial relation, in other
words,


(CD)    For every u ∈ W, there is a v ∈ W such that R(u, v).
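
The truth conditions for O and P and the seriality requirement can be made concrete with a small model checker. This is a minimal sketch, not part of the original presentation: the tuple encoding of formulas, the function names, and the two example models are all assumptions chosen for illustration.

```python
# Formulas are nested tuples: ("atom", "p"), ("not", f), ("and", f, g),
# ("or", f, g), ("imp", f, g), ("O", f), ("P", f).

def holds(model, u, f):
    """Return True iff formula f is true at world u of model = (W, R, val)."""
    W, R, val = model
    op = f[0]
    if op == "atom":
        return u in val.get(f[1], set())
    if op == "not":
        return not holds(model, u, f[1])
    if op == "and":
        return holds(model, u, f[1]) and holds(model, u, f[2])
    if op == "or":
        return holds(model, u, f[1]) or holds(model, u, f[2])
    if op == "imp":
        return (not holds(model, u, f[1])) or holds(model, u, f[2])
    if op == "O":  # obligatory: true at every deontic alternative of u
        return all(holds(model, v, f[1]) for v in W if (u, v) in R)
    if op == "P":  # permitted: true at some deontic alternative of u
        return any(holds(model, v, f[1]) for v in W if (u, v) in R)
    raise ValueError(f"unknown operator: {op}")

def is_serial(model):
    """Every world has at least one deontic alternative."""
    W, R, _ = model
    return all(any((u, v) in R for v in W) for u in W)

p = ("atom", "p")
D_axiom = ("imp", ("O", p), ("P", p))  # Op implies Pp

serial = ({0, 1}, {(0, 1), (1, 1)}, {"p": {1}})
dead_end = ({0, 1}, {(0, 1)}, {"p": {1}})  # world 1 has no alternative

serial_ok = all(holds(serial, u, D_axiom) for u in serial[0])
dead_end_ok = all(holds(dead_end, u, D_axiom) for u in dead_end[0])
```

In the second model, world 1 has no deontic alternatives, so Op is vacuously true there while Pp is false; this is the kind of counterexample that seriality rules out.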

Different further assumptions about the structural properties of the R-relation
validate different deontic principles, and lead to different systems of deontic logic. For
example, it is clear that


Op ⊃ p        (8.8)

is not a logical truth, and therefore R cannot be assumed to be a reflexive relation,
but the principle


O(Op ⊃ p)        (8.9)


seems a valid principle of deontic logic: It ought to be the case that whatever ought
to be the case is the case. The validity of (8.9) follows from the assumption that R
is secondarily reflexive, in other words,


(C.OO)  If R(w, v) for some w, then R(v, v).


((8.9) is not derivable in SDL, but it can be added as an additional axiom. It is
derivable from the Kanger-Anderson reduction in alethic modal logic if that logic
contains T.)
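
The contrast between (8.8) and (8.9) can be seen on a single small frame. The sketch below is an illustration under assumptions: it represents propositions as sets of worlds, uses one hypothetical two-world frame that is serial and secondarily reflexive but not reflexive, and checks every valuation of p on that frame only.

```python
from itertools import chain, combinations

def ought(W, R, X):
    """Worlds where 'O X' is true: every R-alternative lies in X."""
    return {u for u in W if all(v in X for v in W if (u, v) in R)}

def implies(W, X, Y):
    """Worlds where the material conditional 'X implies Y' is true."""
    return (W - X) | Y

def subsets(W):
    """All subsets of W, i.e. all possible truth sets for p."""
    items = sorted(W)
    return (set(c) for c in chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1)))

def secondarily_reflexive(W, R):
    """If some world sees v, then v sees itself."""
    return all((v, v) in R for (_, v) in R)

W = {0, 1}
R = {(0, 1), (1, 1)}  # world 0 does not see itself

# (8.9) O(Op implies p), for every valuation of p on this frame
valid_89 = all(
    ought(W, R, implies(W, ought(W, R, X), X)) == W for X in subsets(W))

# (8.8) Op implies p, for every valuation of p on this frame
valid_88 = all(implies(W, ought(W, R, X), X) == W for X in subsets(W))
```

On this frame (8.8) fails when p is true only at world 1: p is then obligatory at world 0 without being true there, whereas (8.9) survives every valuation.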

The semantics sketched above, due initially to Hintikka (1957, 1981) and Kanger
(1981 [1957]), may be termed the ‘standard semantics’ of deontic logic. It gives an
intuitively plausible account of the meanings of simple deontic sentences when the
deontic alternatives to a given world u are taken to be worlds (or situations) in
which everything that is obligatory at u is the case; they are worlds in which all
obligations are fulfilled. Hence, the worlds related to a given world by R may be
termed deontically perfect or ideal worlds (relative to u). If possible worlds are
regarded as possible courses of events or histories which are partly constituted by an
agent’s actions, the semantics of SDL simply divides such histories into deontically
acceptable and deontically unacceptable histories. An action is permitted iff it is part
of some deontically acceptable course of events or if there is some deontically
acceptable way of performing the action, and an action is obligatory iff no course of
events is acceptable unless it exemplifies the action in question. The set of acceptable
courses of action (relative to a given action situation) may be termed the field of
permissibility (Lewis, 1979). According to the deontic consistency principle (CD),
the field of permissibility is never empty; some action is permissible in any situation.


8.4. Problems and Paradoxes 


SDL, like any logical system designed for certain applications, faces two kinds of 
problems: 


(a) Problems of interpretation and application: 
How should the deontic operators O, P, and F and the non-logical
(propositional) symbols p, q, r, ..., of the system be interpreted, and how
should the metalogical and semantic concepts truth, validity, and logical
consequence be understood in this context?

(b) Problems about the adequacy of the formalization of normative reasoning
provided by the standard system:

Does SDL give an adequate and correct account of the logical relationships
among norms or normative propositions?

These questions (or classes of questions) are obviously interrelated; the adequacy of
a system of deontic logic depends on its interpretation. Both questions have been
discussed extensively in the recent literature.

Deontic logic is usually defined as the logic of the basic normative concepts or,
more generally, as the logic of normative or prescriptive discourse. This
characterization gives rise to an interesting question about the metalogical concepts of validity
and logical consequence in deontic logic. These concepts were defined above in the
standard way in terms of the concept of truth, but norms and directives cannot be
said to be true or false in the same sense as statements and assertions, and therefore
the standard concepts of validity and logical consequence, familiar from the logic of
descriptive or assertoric discourse, seem inapplicable to the logic of normative
discourse. The Danish philosopher Jørgensen presented this observation in the 1930s
as an objection to the very possibility of the logic of imperatives (commands). Since
imperatives are not true or false, it does not, strictly speaking, make sense to speak
about the logic of imperatives. Norms are, in this respect, analogous to imperatives.
On the other hand, as Jørgensen (1937/8) observed, it seems clear that directives or
imperatives can be inferred from other directives or that two directives can be
logically inconsistent. This difficulty is called Jørgensen’s dilemma; Makinson (1999)
describes it as “the fundamental problem of deontic logic.”

Many philosophers have proposed to solve this problem by making a distinction
between two uses of norm sentences; they can be used for expressing norms or
directives and for making normative statements (statements about norms). The
latter are descriptive statements which state that something is obligatory, permitted
or prohibited according to a given system of norms (Bulygin, 1982). For example,
the deontic sentence


Motor vehicles ought to use the right-hand side of a road.


can be regarded as a directive addressed to drivers, or as a statement which gives
information about the traffic code of some (unspecified) country. If it is regarded as
a statement about the U.S. traffic regulations, it is a true statement; understood as a
statement about the U.K. regulations, it is false. Normative statements, unlike the
norms themselves, are true or false, and the logical relationships among normative
statements can therefore be understood in the usual way. If deontic logic is regarded
as the logic of normative statements, Jørgensen’s problem does not arise.

This way out of Jørgensen’s difficulty does not mean that the prescriptive or
genuinely normative use of deontic sentences is not subject to logical laws, because
the distinction between the normative and the descriptive use of deontic sentences
can be understood as two ways of using normative statements (which are true
or false). As Kamp (1973/4, 1979) has pointed out, a normative sentence, like the
above


Motor vehicles ought to use the right-hand side of the road.


(which is true or false), can be used or uttered performatively, to create or sustain a
norm, or assertorically, to describe an independently existing norm system. In the
former case, the utterance of the statement in the appropriate circumstances (by a
proper norm authority) has normative force, and is sufficient to make the statement
true; in the latter case, the truth of the statement depends on whether it fits a norm
system whose content is independent of the utterance in question. Thus the
prescriptive-descriptive distinction coincides with the distinction between two uses of
deontic statements, the performative use and the assertoric use; and, in both cases,
the statements in question can be regarded as true or false. Consequently, the
concepts of validity and logical consequence can be defined in deontic logic in terms
of the concept of truth in the same way as in other areas of logic. According to
Kamp, the assertoric use of deontic sentences should depend on their performative
use. In their performative use, the function of O- and F-sentences (obligation and
prohibition sentences) is to restrict the range of normatively acceptable options
available to an agent (the addressee), whereas permission sentences have the opposite
effect; they enlarge the set of deontically acceptable action possibilities. For example,
Kamp has put forward the following principle concerning the performative and
assertoric uses of permission sentences:


(PP)    An assertoric utterance of a permission sentence Ps in a context c is true
        iff all those worlds already belong to the options of the agent that a
        performative use of Ps would have added to the set of the agent’s
        options if they had not already belonged to it.


Kamp has also observed that it is not always clear whether a deontic sentence is used
performatively or assertorically. However, if the assertoric use of deontic sentences is
governed by (PP) (and by analogous principles for ought-sentences and prohibition
sentences), assertoric utterances of deontic sentences can guide and direct the agent’s
actions in the same way as their performative utterances. For example, in the case of
a permission sentence, “either the utterance is a performative and creates a number
of new options, or else it is an assertion; but then if it really is appropriate it must be
true; and its truth then guarantees that these very same options already exist”
(Kamp, 1979, p. 264). The practical consequences of the utterance for the addressee
are the same in both cases.

According to SDL, deontic logic is a branch of modal logic, and many principles
of deontic logic are special cases of more general modal principles. This approach to
deontic logic has sometimes been criticized on the ground that it ignores or
misrepresents many significant features of normative discourse which distinguish it from
other varieties of modal discourse. It has been argued that some principles of SDL,
including some of the principles (8.1)-(8.7) listed above, lead to paradoxes and are
therefore unacceptable. For example, some philosophers have felt that there is
something paradoxical about the formula (8.3), which says that if it ought to be the case
that p, then it ought to be the case that p ∨ q. (8.3) authorizes, e.g., the inference
from the directive (8.10) to (8.11):


Peter ought to mail a letter.                   (8.10)
Peter ought to mail a letter or burn it.        (8.11)





which seems to some an unacceptable inference. This is known as Ross’s paradox,
originally due to the Danish philosopher Alf Ross (1941). A somewhat similar
(putative) paradox depends on principle (8.5), according to which the permissibility
of p entails the permissibility of p ∨ q (for any q); for example, according to (8.5),
(8.12) entails (8.13):


Peter may drink water.                          (8.12)
Peter may drink water or drink whisky.          (8.13)


which also seems counter-intuitive. These inferences are of course valid if sentences
(8.10)-(8.13) are understood in terms of the possible worlds semantics outlined
above. If Peter mails a letter in all deontically perfect situations, then he mails a
letter or burns it in all such situations, and if Peter drinks water in some deontically
satisfactory situation, then he drinks water or whisky in some such situation. But this
may be taken as evidence that the semantics of SDL fails to do justice to significant
features of normative discourse.
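
That the possible worlds semantics validates the two disputed inferences, but not their converses, can be checked exhaustively on small models. This is a sketch under assumptions: it enumerates every relation and every valuation of p and q on a hypothetical two-world universe, so it illustrates the semantic point on those models only.

```python
from itertools import product

W = (0, 1)
PAIRS = [(u, v) for u in W for v in W]

def alternatives(R, u):
    return [v for v in W if (u, v) in R]

def O(R, truth, u):
    """Obligatory at u: true at all deontic alternatives of u."""
    return all(truth[v] for v in alternatives(R, u))

def P(R, truth, u):
    """Permitted at u: true at some deontic alternative of u."""
    return any(truth[v] for v in alternatives(R, u))

def all_models():
    """Yield (R, p, q): every relation and every valuation of p, q on W."""
    for bits in product([False, True], repeat=len(PAIRS)):
        R = {pair for pair, b in zip(PAIRS, bits) if b}
        for p in product([False, True], repeat=len(W)):
            for q in product([False, True], repeat=len(W)):
                yield R, p, q

def p_or_q(p, q):
    return tuple(a or b for a, b in zip(p, q))

# Op implies O(p or q): the principle behind Ross's paradox
ross_obligation = all((not O(R, p, u)) or O(R, p_or_q(p, q), u)
                      for R, p, q in all_models() for u in W)

# Pp implies P(p or q): the analogous principle for permission
ross_permission = all((not P(R, p, u)) or P(R, p_or_q(p, q), u)
                      for R, p, q in all_models() for u in W)

# The converse, O(p or q) implies Op, is not validated
converse_obligation = all((not O(R, p_or_q(p, q), u)) or O(R, p, u)
                          for R, p, q in all_models() for u in W)
```

The converse fails, for instance, whenever the only deontic alternative makes q true and p false, which matches the observation that the disjunctive norm excludes fewer possibilities than the simple one.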

“The inferences in question may seem especially paradoxical ifthe sentences (8.10)~ 
(8.13) are thought of as being used performatively or normatively, ic, if they are 
used for issuing a norm or a permission and not merely for describing the content of 
a system of norms. It is obvious that the effects of a normative utterance of (8.11) 
are not the same as the effects of (8.10), unlike (8.10), (8.11) does not suffice to 
make the action of posting the letter obligatory or required (for Peter). (8.10) 
‘excludes more possibilities (restricts the field of permissibility more) than (8.11). In 
the same way, the normative effects of a performative utterance of (8.13) are not the 
same as those of (8.12). IF (8.12) is used performatively, it opens (makes permitted) 
some possibilities in which Peter drinks water, but (8.13) opens a more vaguely 
defined set of possibilities, namely, some possibilities where Peter drinks water or 
‘whisky, ‘The claim that the inference of (8.11) and (8.13) from (8.10) and (8.12) is 
paradoxical or unacceptable seems to be tacitly based on the following principle: 


(Ine) If a norm (permission) N, entails N;, then the normative (performa- 
tive) effects of N; entail the effects of N;. 


But this principle is obviously false; logical deduction should not be expected to 
preserve the effects of a norm on a norm system any more than logical deduction 
preserves the effects of the acceptance of a declarative statement on a person's belief 
system (or corpus of knowledge). The effects of the acceptance of a disjunctive 
proposition on a person’s belief system are quite different, and usually less signif 
cant, than the effects of the acceptance of one of the disjuncts; a disjunctive belief 
adds less content to a belief system than either of the disjuncts. 

‘The apparent paradox related to disjunctive permissions scems more interesting, 
than Ross's paradox. According to the SDL, the normative use of (8.13) should 
‘make acceptable some worlds (or situations) in which Peter drinks water or whisky. 
This can be accomplished by allowing some situations in which Peter drinks 
water. However, a normative utterance of (8.13) is normally taken to permit some 
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situations in which Peter drinks whisky as well as some situations in which Peter 
drinks water; in other words, (8.13) usually seems to have the same effect as the 
ttterance of the conjunction 


Peter may drink water and Peter may drink whisky. (8.4) 


A disjunctive permission seems to offer a choice between the two disjunects and thus 
entail a conjunction of two permissions. This feature of disjunctive permissions 
cannot be explained on the basis of SDL alone, but depends on some pragmatic 
features of disjunctive permissions. However, a disjunctive permission does not 
necessarily permit both disjuncts, but may leave the determination of the field of 
permissibility partly open, as in the case of the statement (Kamp, 1979, p. 271) 


Yes, you may drink water or whisky, but you have to consult your doctor before 
you drink whisky. (8.15) 


(8.15) may be an instance of « normative (performative) use of a permission sen- 
tence; a norm authority makes a disjunctive action permitted, but refers to another 
authority for the determination of the permissbility of one of the disjunets. This 
suggests that the principle 


Plpv 9) > Pp & Py (8.16) 


should not be regarded as a general principle for the concept of permission. 

‘Another much discussed paradox is related to rule (RMp). If q is a logical 
consequence of p, then, according to (RM), Op entails Og. Since knowing that p 
centails the truth of p, 


OK.p> Op (8.17) 


is a valid formula, where ‘K_a p’ means that a knows that p. For example, if Gladys,
who is a firefighter, ought to know that there is a fire, then, according to (8.17),
there ought to be a fire, which is quite counter-intuitive (Åqvist, 1967). In other
words, according to (8.17), the following statements cannot be all true:


p ⊃ OK_a p (8.18)
p (8.19)
O¬p (8.20)


But in a situation in which there is a fire, (8.18)-(8.20) all seem true; if there is a
fire, Gladys ought to know it, but there ought not to be a fire. Some philosophers
have regarded this paradox (the ought-to-know paradox or the paradox of epistemic
obligation) and other similar paradoxes as evidence that (RMp) is not a valid principle
of deontic logic (Goble, 1991).


8.5. Conditional Norms


In the example above, (8.18) expresses a conditional obligation: Gladys ought to
know that there is a fire if there is one, not otherwise. As was observed above, in the
semantics of SDL, the interpretation of deontic sentences is based on a division of
possible worlds or situations into ‘deontically perfect’ or normatively faultless worlds
and normatively unacceptable or imperfect worlds. Systems of conditional norms
(conditional obligations) are often semantically more complex, and an attempt to
formalize them in SDL is apt to lead to paradoxes. Chisholm (1963) has given an
example of such a set. The following sentences seem jointly consistent and pairwise
logically independent:


(Ch1) Jones ought to go to help his neighbors.


(Ch2) Jones ought to tell his neighbors he is coming if he is going to help 
them. 


(Ch3) If Jones does not go to help his neighbors, he ought not to tell them 
he is coming. 


(Ch4) Jones does not go to help his neighbors. 


In the language of SDL, these sentences might be expressed as follows: 


Oh (8.21)
O(h ⊃ t) (8.22)
¬h ⊃ O¬t (8.23)
¬h (8.24)


where h says that Jones goes to help his neighbors, and t says that Jones tells his
neighbors that he is coming. According to SDL, (8.21) and (8.22) entail


Ot (8.25)

and (8.23) and (8.24) entail

O¬t (8.26)


by propositional logic; in other words, (8.21)-(8.24) entail


Ot & O¬t (8.27)

and according to the consistency principle (Dp), (8.27) entails

Ot & ¬Ot (8.28)

Thus the suggested interpretation of (Ch1)-(Ch4) makes them jointly inconsistent.
This seems intuitively unsatisfactory; (Ch1)-(Ch3) seem a reasonable and consistent
set of requirements, and the fact that Jones does not go to help his neighbors
should not make them jointly inconsistent.

It may be suggested that there is an unjustified logical asymmetry between (8.22)
and (8.23); in (8.22), the O-operator precedes ⊃, but, in (8.23), their order is
reversed. The corresponding asymmetry between (Ch2) and (Ch3) does not seem to
be semantically significant. If (Ch2) is represented by


h ⊃ Ot (8.29)

or (Ch3) is formalized as

O(¬h ⊃ ¬t) (8.30)


the contradiction is avoided; (8.21) and (8.29) do not entail (8.25), and (8.24) and
(8.30) do not entail (8.26). However, according to SDL, (8.29) is a logical
consequence of (8.24), and, on the other hand, (8.30) is a logical consequence of
(8.21). Both results are intuitively unacceptable; as was noted above, the sentences
(Ch1)-(Ch4) seem to be pairwise logically independent of each other.
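
The SDL claims made in this section can be checked mechanically. Because the
formulas (8.21)-(8.24), (8.29) and (8.30) contain no nested deontic operators,
their truth at a world depends only on the truth-values of h and t there and on
the nonempty set of deontic alternatives of that world, so a brute-force search
over these finitely many cases suffices. The Python sketch below is an
illustration under that assumption; all function names are invented for the
example.

```python
from itertools import combinations, product

# Worlds are (h, t) truth-value pairs:
#   h = Jones goes to help, t = Jones tells his neighbors he is coming.
WORLDS = list(product([True, False], repeat=2))

def nonempty_subsets(items):
    for size in range(1, len(items) + 1):
        yield from combinations(items, size)

def O(prop, alts):
    """SDL obligation: prop holds in every deontic alternative."""
    return all(prop(h, t) for (h, t) in alts)

def implies(a, b):
    return (not a) or b

def models():
    """Every actual world paired with every nonempty set of alternatives."""
    for actual in WORLDS:
        for alts in nonempty_subsets(WORLDS):
            yield actual, alts

def f821(actual, alts):  # Oh
    return O(lambda h, t: h, alts)

def f822(actual, alts):  # O(h ⊃ t)
    return O(lambda h, t: implies(h, t), alts)

def f823(actual, alts):  # ¬h ⊃ O¬t
    return implies(not actual[0], O(lambda h, t: not t, alts))

def f824(actual, alts):  # ¬h
    return not actual[0]

def f829(actual, alts):  # h ⊃ Ot
    return implies(actual[0], O(lambda h, t: t, alts))

def f830(actual, alts):  # O(¬h ⊃ ¬t)
    return O(lambda h, t: implies(not h, not t), alts)

def satisfiable(formulas):
    return any(all(f(a, s) for f in formulas) for a, s in models())

def entails(premises, conclusion):
    return all(conclusion(a, s) for a, s in models()
               if all(f(a, s) for f in premises))

chisholm_consistent = satisfiable([f821, f822, f823, f824])
norms_alone_consistent = satisfiable([f821, f822, f823])
not_h_entails_829 = entails([f824], f829)
oh_entails_830 = entails([f821], f830)

print(chisholm_consistent, norms_alone_consistent,
      not_h_entails_829, oh_entails_830)
```

The search finds no model of (8.21)-(8.24) together, although (8.21)-(8.23) are
satisfiable on their own, and it confirms that (8.24) entails (8.29) and that
(8.21) entails (8.30), which is why neither alternative formalization preserves
the logical independence of the four sentences.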

Sentence (Ch3) tells what Jones ought to do in a situation where he has failed to
fulfill his duty to help his neighbors; it expresses a contrary-to-duty (CTD)
obligation. For this reason, Chisholm’s paradox may also be called the paradox of CTD
obligation. Chisholm’s example shows that systems of norms which contain both
primary obligations and CTD obligations cannot be formalized in SDL in a
satisfactory way. Some authors have proposed to avoid the inconsistency between (8.25)
and (8.26) by relativizing the concept of obligation (or the concept of ought) to
time since, it has been suggested, e.g., by Åqvist and Hoepelman (1981), (8.25) and
(8.26) hold at different points of time. However, this does not seem to be an
essential feature of Chisholm’s paradox. There are many non-temporal versions of
the CTD-paradox, such as the following situation: Assume that dogs are not
permitted in a certain village, but if anyone happens to have a dog, there ought to be
a warning sign about it in front of the owner's house. Moreover, warning signs
ought not to be posted without sufficient reason. Thus the following normative
statements seem to be true:


(Ds1) There ought to be no dog.
(Ds2) There ought to be no warning sign if there is no dog.
(Ds3) If there is a dog, there ought to be a warning sign.
(Ds4) There is a dog.

(Ds1)-(Ds4) are formally analogous to Chisholm’s example, and an attempt to
formalize them in SDL leads to a similar inconsistency (Carmo and Jones, 2001;
Prakken and Sergot, 1997).

The deduction of the contradiction (8.27) from (8.25) and (8.26) depends on the
principle of normative consistency (Dp), Op ⊃ ¬O¬p. This principle has been
criticized independently of Chisholm’s example. (Dp) excludes the possibility of
normative conflicts, but such conflicts are not unusual in morality and law, and it may be
argued that they do not amount to paradoxes (Chellas, 1974, p. 24; Goble, 1999,
p. 332). If the consistency principle is rejected, the aggregation principle (8.2),
Op & Oq ⊃ O(p & q), should be rejected as well, because the latter principle
undermines the distinction between conflicts between obligations and the existence of
a self-contradictory obligation; the recognition of the possibility of normative
conflicts does not mean that one should also admit the possibility of self-contradictory
obligations. Thus logicians have developed systems of deontic logic in which (Dp)
and the aggregation principle do not hold (Chellas, 1980, pp. 201-2). Nevertheless,
such systems do not help to give a satisfactory solution to the puzzles about the
CTD-obligations. They enable one to conclude only that CTD-situations involve
conflicting obligations without offering any analysis of CTD-obligations and their
relationship to the ‘primary’ obligations.

It is not difficult to see why Chisholm’s example cannot be represented in a
satisfactory way in SDL. As observed above, the semantics of SDL is based on a
division of worlds or situations into acceptable (deontically perfect) and
unacceptable worlds, and the O-sentences describe how things are in the deontically perfect
worlds. But sentence (Ch3) in Chisholm’s example does not tell how things are in a
deontically faultless world; it tells what the agent (Jones) ought to do under deontically
imperfect conditions, i.e., in situations in which Jones does not act in accordance
with his duties. (Ch3) is a contrary-to-duty obligation. The situation could be
described by saying that among the (less than perfect) worlds where Jones does not
fulfil his duty to help his neighbors, those in which he does not tell them he is
coming are preferable to the circumstances where he makes a false promise. Thus the
interpretation of Chisholm’s example seems to require a distinction between
different degrees of deontic perfection.

According to this interpretation, (Ch2) can be taken to mean that, in deontically
perfect circumstances, where Jones is going to help his neighbors, he tells them that
he is coming, and (Ch3) says that in the best worlds where he is not going to help
his neighbors, he does not tell them he is coming (Hansson, 1981, p. 143). Express
these conditional obligations by


O(t/h) (8.31)
O(¬t/¬h) (8.32)


respectively. Call worlds where p is true simply ‘p-worlds’; in other words, w is a
p-world iff w ∈ I(p). Let the p-worlds that are normatively least objectionable relative
to a given situation u be called deontically optimal p-worlds relative to u, briefly,
Opt(p, u). The concept of deontically optimal p-world is a generalization of the
concept of a deontically perfect world of SDL; the (absolute) deontic perfection of
w relative to u, i.e., R(u, w), can be taken to mean that w is T-optimal relative to u,
where T is a logical truth:

(COT) Op ≡ O(p/T)

The assumption that for any consistent proposition p, there is a nonempty set of
deontically optimal p-worlds, is a generalization of the principle (CD) of SDL, i.e.,
the principle that any world has a nonempty set of deontic alternatives. The truth of
the conditional ought-statement O(q/p) at u can be taken to mean that q is true in
all deontically optimal p-worlds (relative to u), i.e.,

(CO.cond) u ⊨ O(q/p) iff q is true in every world w ∈ Opt(p, u)

If p entails r and p is true in some r-optimal world (relative to u), the p-optimal
worlds (relative to u) are obviously the r-optimal worlds where p is true; in other
words, the concept of optimality (or relative deontic perfection) is subject to the
following condition:


(C.Opt) If I(p) ⊆ I(r) and I(p) ∩ Opt(r, u) is non-empty, then
Opt(p, u) = I(p) ∩ Opt(r, u).


Thus, according to (C.Opt), the truth of

Op (8.33)

means that Opt(p, u) = Opt(T, u), and

O(q/p) (8.34)


means that all optimal p-worlds are q-worlds; hence, according to (C.Opt), (8.33)
and (8.34) entail


Oq (8.35)

Hence, according to this semantics, the principle of ‘deontic detachment’

(DDet) O(q/p) ⊃ (Op ⊃ Oq)


is a valid principle for conditional obligations. On the other hand, the principle of
‘factual detachment’


(FDet) O(q/p) ⊃ (p ⊃ Oq)


does not hold. If (Ch2) and (Ch3) are interpreted in this way, (Ch1)-(Ch4) do not
lead to a contradiction; (Ch1) and (Ch2) entail the obligation Ot (8.25), but (Ch3)
and (Ch4) do not entail O¬t (8.26).
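
The behavior of the two detachment principles can be illustrated with a small
model of the optimality semantics. The Python sketch below makes a simplifying
assumption that is not part of the text: optimality is generated by a single
numerical ranking of worlds that is the same for every situation u, so that
Opt(p, u) is the set of best-ranked p-worlds (a ranking of this kind satisfies
(C.Opt)). The ranking itself and all names are hypothetical.

```python
# Worlds are (h, t) pairs; a lower rank means normatively less objectionable.
# The ranking is a hypothetical choice made for this illustration.
RANK = {
    (True, True): 0,    # helps and tells: deontically perfect
    (True, False): 1,   # helps without telling
    (False, False): 2,  # does not help, does not tell
    (False, True): 3,   # does not help but says he is coming
}
WORLDS = list(RANK)

def opt(cond):
    """Opt(p, u): best-ranked worlds satisfying cond (cond must be satisfiable)."""
    candidates = [w for w in WORLDS if cond(w)]
    best = min(RANK[w] for w in candidates)
    return [w for w in candidates if RANK[w] == best]

def O_cond(q, p):
    """(CO.cond): O(q/p) holds iff q is true in every optimal p-world."""
    return all(q(w) for w in opt(p))

def O(q):
    """(COT): Op is O(p/T), where T is true in every world."""
    return O_cond(q, lambda w: True)

def h(w):
    return w[0]

def t(w):
    return w[1]

def not_h(w):
    return not w[0]

def not_t(w):
    return not w[1]

actual = (False, False)          # (Ch4): Jones does not go to help

ch1 = O(h)                       # Oh
ch2 = O_cond(t, h)               # O(t/h)
ch3 = O_cond(not_t, not_h)       # O(¬t/¬h)
ch4 = not_h(actual)              # ¬h

ought_tell = O(t)                # Ot: follows by deontic detachment
ought_not_tell = O(not_t)        # O¬t: would require factual detachment

print(ch1, ch2, ch3, ch4, ought_tell, ought_not_tell)
```

All four Chisholm sentences are true in this model, Ot is true, and O¬t is
false even though both O(¬t/¬h) and ¬h hold, which is a counterexample to
(FDet) in the sketched semantics.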

Another possible response to Chisholm’s paradox is the replacement of the
truth-functional conditional in (8.22) and (8.23) by an intensional or subjunctive conditional
(Mott, 1973) or even a relevant conditional (Goble, 1999) without introducing a
special concept of conditional obligation. It has been known since the beginning of
the twentieth century, indeed, from antiquity, that conditional statements are usually
not truth-functional. Philosophers have attempted to represent if-then-sentences as
truth-functional (or ‘material’) conditionals for want of a better theory, but the
situation changed in the early 1970s when David Lewis (1973) and others developed
intensional theories of conditionals. In the representation of Chisholm’s example in
SDL, the logical asymmetry between (8.22) and (8.23) is required by the
assumption of the logical independence of (Ch1)-(Ch4), and this makes it possible to derive
the inconsistency (8.25)-(8.26). If the two conditionals are expressed as intensional
conditionals with a Lewis-type semantics, this problem does not arise. An intensional
conditional, e.g., a subjunctive conditional, ‘q if p’ can be regarded as true in a
situation u iff q is true in all possible worlds (situations) in which p is true but which
resemble u in other respects as much as possible. The truth of such a conditional is
not a consequence of the falsity of p (or of the truth of q) at u.

If the conditional ‘q if p’ is symbolized ‘p > q’, and (Ch2) and (Ch3) are represented
by


h > Ot (8.36)
¬h > O¬t (8.37)


respectively, no contradiction will arise. If the counterpart of the modus ponens
principle holds for the conditional connective, i.e., if


(FDet>) (p > Oq) ⊃ (p ⊃ Oq)


is logically true, (Ch3) and (Ch4) entail (8.26), but (Ch1) and (Ch2) do not entail
(8.25). The former analysis of conditional obligations, as O(q/p), leads in Chisholm's
example to the result that Jones ought to tell his neighbors that he is coming to help
them, but the second analysis, p > Oq, gives the result that Jones ought not to tell
his neighbors that he is coming. Thus the two analyses seem to involve two different
senses of ‘ought’ (or ‘obligation’). The first interpretation of (Ch1)-(Ch4) seems to
take the statements as expressions of ‘ideal’ or prima facie obligations; (Ch1)-(Ch2)
can be regarded as saying that insofar as Jones ought to help his neighbors, he ought
to tell them that he is coming; but if he is in fact not going to help his neighbors,
he has an ‘actual’ or practical obligation not to tell them he is coming. There seems
to be no logical or deductive connection between the two kinds of ought, but the
existence of an ideal (or prima facie) obligation serves as evidence for the
corresponding practical obligation. The inference of actual obligations from ideal
obligations is an abduction rather than a deduction.

Both forms of conditional obligation, O(q/p) and p > Oq, are defeasible in the sense
that they do not satisfy the principle of strengthening the condition of the obligation
or strengthening the antecedent of the conditional; in other words, the principles


O(q/p) ⊃ O(q/p & r) (8.38)
(p > Oq) ⊃ ((p & r) > Oq) (8.39)

are not valid. However, according to Lewis's (1973) semantics for subjunctive
conditionals, the counterpart of modus ponens holds for the >-connective, making
the detachment of actual obligations from factual premises possible; the Lewis-type
conditionals are ‘strict,’ albeit only ‘variably strict.’ In the recent literature, however,
many authors have analyzed conditional obligations (including CTD-obligations) by
means of defeasible conditionals for which modus ponens does not hold, for example,
when the conditional p > q is read ‘Normally, q holds in circumstances p’ (Alchourrón,
1993, p. 75; Makinson, 1993, pp. 363-5). According to this interpretation, (8.24)
and (8.37) do not entail (8.26) in the standard sense of logical consequence, but
provide only evidence for it. Different variants of Chisholm’s example and the
attempts to represent various CTD-obligations and other conditional obligations in
formal systems of deontic logic have generated an extensive literature on the subject;
see Carmo and Jones (2001) and the articles in Nute (1997).
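
The failure of strengthening for the dyadic form O(q/p) can be seen in a small
ranked model. The Python sketch below covers only principle (8.38), with p
taken to be the logical truth T; it assumes that the optimal worlds for a
condition are the best-ranked worlds satisfying it, and the ranking is a
hypothetical choice made for the example. The conditional form p > Oq is not
modelled here.

```python
# Worlds are (q, r) truth-value pairs; a lower rank means a better world.
# The ranking is a hypothetical choice made for this example.
RANK = {
    (True, False): 0,
    (False, True): 1,
    (True, True): 2,
    (False, False): 3,
}

def opt(cond):
    """Best-ranked worlds among those satisfying cond (cond must be satisfiable)."""
    candidates = [w for w in RANK if cond(w)]
    best = min(RANK[w] for w in candidates)
    return [w for w in candidates if RANK[w] == best]

def O_cond(consequent, condition):
    """O(q/p): the consequent holds in all optimal condition-worlds."""
    return all(consequent(w) for w in opt(condition))

def q(w):
    return w[0]

def r(w):
    return w[1]

def top(w):          # T, a condition true in every world
    return True

def top_and_r(w):    # T & r, the strengthened condition
    return top(w) and r(w)

original = O_cond(q, top)            # O(q/T)
strengthened = O_cond(q, top_and_r)  # O(q/T & r)

print(original, strengthened)
```

Here O(q/T) is true because the single best world is a q-world, but the best
r-world is not a q-world, so O(q/T & r) is false and the instance of (8.38)
fails.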


8.6. On the Representation of Actions in Deontic Logic 


Above, the schematic letters p, q, r, etc., are propositional symbols; they represent
propositions. However, in informal normative discourse deontic concepts are usually
applied to actions rather than propositions. Philosophers have made a distinction
between two kinds of ought, the ought-to-be (Seinsollen) and the ought-to-do
(Tunsollen); see, for example, Castañeda (1972 [1970]). It has been suggested
that since the deontic operators of SDL are propositional operators, the standard
deontic logic and the extensions and revisions discussed above should be regarded as
theories of the ought-to-be rather than theories of the ought-to-do. It has been
argued that in a satisfactory theory of the ought-to-do, deontic operators should be
construed as action modalities rather than propositional modalities. Deontic concepts
were understood in this way by Leibniz and by other authors of the seventeenth and
the eighteenth centuries (Hilpinen, 1993a, pp. 85-6). Von Wright's first system of
deontic logic (1957 [1951]) can be regarded as an attempt to articulate and
formalize this view. In this system, the deontic operators O, P and F are prefixed, not to
propositional expressions (statements), but to expressions for action-types or, in von
Wright's terminology, ‘act-qualifying properties.’ Castañeda (1972 [1970], 1981)
adopted a similar approach; he stressed the importance of distinguishing action
descriptions, or practitions, from propositional expressions. According to Castañeda,
deontic reasoning is reasoning about practitions (as opposed to propositions which
describe the conditions or circumstances of action), and deontic operators can be
applied only to practitions, not to propositions.

Von Wright's and Castañeda’s distinction between propositions and action terms
(or practitions) has been formalized and developed further in dynamic deontic logic;
see Czelakowski (1997), Meyer (1988) and Segerberg (1982). In dynamic logic, the
interpretation of action terms reflects the common philosophical view that an action
can often be described as the bringing about of a change in the world. According
to this interpretation, an action transforms a given situation or a world-state into a
new state (or keeps it unchanged). For example, in his ‘action-state semantics’ for
imperatives, Hamblin (1987) has analyzed actions or deeds in terms of successive
world-states. Thus the distinction between propositions and action terms is
interpreted in the semantics of (dynamic) deontic logic as the distinction between sets of
world-states (propositions) and relations between world-states. Let A, B, C, ... be
action terms or action descriptions. Action terms can be simple or complex; the
latter are formed from simple action terms by act-connectives, some of which are
analogous to propositional connectives. For example, if A and B are action terms,
the following expressions are also action terms:


(ActT1) A + B: doing A or B
(ActT2) A * B: doing A and B together


It is also convenient to have an expression for the omission of an act,

(ActT3) OmA: omitting A


OmA is applicable to all actions (world-state transitions) which fail to exemplify A.
Systems of dynamic logic usually also contain act-connectives which have no
counterparts in propositional logic, for example,


(ActT4) A;B: A followed by B
(ActT5) A*: doing A a finite number of times


For the sake of simplicity, this chapter considers only complex actions of types
(ActT1-3).

Actions change the world; thus an action in a space W of possible worlds or
situations may be interpreted as a binary relation, i.e., as a set of ordered pairs (u, w)
such that the action in question can transform the first situation into the second.
The ordered pairs assigned to an action-term A may be called the possible
performances of the action A. A world-state w is said to be possible relative to u or accessible
from u iff it is possible for some action or sequence of actions to lead from u to w.
Denote the accessibility relation by Poss, and let Poss_u be the set of transitions which
originate from u. The semantics of SDL can be applied to action terms in a relatively
straightforward way. In SDL, possible worlds are divided into acceptable
(deontically correct) and unacceptable worlds; and, in the deontic logic of action, world-state
transitions can be divided in an analogous way into deontically acceptable (legal)
and deontically unacceptable (illegal) transitions. Let Leg_u be the set of legal
transitions which originate from u, and let Ill_u be the set of illegal transitions from u.
It is assumed that any possible transition from u is either legal or illegal (there is
no deontic indeterminacy), and no transition is both legal and illegal; in other
words


(Ddet) Leg_u ∪ Ill_u = Poss_u
(Dcons) Leg_u ∩ Ill_u = ∅

The assumption that there is some legal way out of every situation, in other words,

(DactD) For every u ∈ W, Leg_u is nonempty


corresponds to principle (D) of SDL, i.e., the postulate that every world (situation)
has some deontic alternative. Let I be an interpretation function which assigns to
each action A its possible performances (a subset of W × W), and let I_u(A) be the
performances of A which originate from u; thus I_u(A) ⊆ Poss_u.

The basic normative concepts (deontic action modalities) can be defined by these
truth conditions:


(CF.act) u ⊨ FA iff I_u(A) ⊆ Ill_u
(CP.act) u ⊨ PA iff I_u(A) ∩ Leg_u ≠ ∅
(CO.act) u ⊨ OA iff I_u(OmA) ⊆ Ill_u


These definitions are simple generalizations of the truth-conditions of normative
propositions in SDL. According to (CF.act), an act A is prohibited in a given
situation if every possible performance of A at that situation is illegal, and A is
permitted iff it can be performed in a legal way. According to (CO.act), A is
obligatory at u iff its omission at u would be illegal.
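
The three truth conditions translate directly into set operations on
transitions. The Python sketch below is an illustration only: the world-state
names, the division into legal and illegal transitions, and the interpretation
of the action terms A, B and C are hypothetical, and the omission OmA is
computed as the set of possible transitions from u that do not exemplify A.

```python
# Transitions are (source, target) pairs of world-state names.
u = "u"
poss_u = {(u, "w1"), (u, "w2"), (u, "w3")}   # Poss_u
leg_u = {(u, "w1"), (u, "w2")}               # Leg_u (nonempty, as in (DactD))
ill_u = poss_u - leg_u                       # Ill_u, so (Ddet) and (Dcons) hold

# I_u: hypothetical interpretation of three simple action terms at u.
I_u = {
    "A": {(u, "w1"), (u, "w3")},   # can be performed legally or illegally
    "B": {(u, "w3")},              # every performance is illegal
    "C": {(u, "w1"), (u, "w2")},   # every legal transition exemplifies C
}

def omission(performances):
    """I_u(OmA): the possible transitions that fail to exemplify A."""
    return poss_u - performances

def forbidden(a):    # (CF.act): every performance of a is illegal
    return I_u[a] <= ill_u

def permitted(a):    # (CP.act): some performance of a is legal
    return bool(I_u[a] & leg_u)

def obligatory(a):   # (CO.act): every omission of a is illegal
    return omission(I_u[a]) <= ill_u

for name in sorted(I_u):
    print(name, forbidden(name), permitted(name), obligatory(name))
```

In this model A is permitted but not obligatory, B is forbidden, and C is
obligatory, since the only way to omit C is the illegal transition to w3.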

According to (CP.act), the permissibility of an action A means that some possible
performances of A (in a given situation u) are deontically acceptable. For example,
A may be permitted in this sense if it can be performed together with some other
acts. This is a ‘weak’ concept of permission which corresponds to that defined in
SDL. In the present framework, it is possible to define another concept of permission
which may be termed a strong permission. When one says that an act A is permitted
in a given situation, one often means that A itself is not illegal, i.e., that no sanction
is attached to A, and not only that some (possible) performances of A would be
deontically acceptable in the situation. This sense of permission can also be expressed
in the form of a conditional: if the agent a were to do A, a would not do anything
illegal. The truth-conditions of such a conditional can be formulated by means of a
selection function f which selects from I_u(A) the transitions which exemplify A but
change the original situation u in other respects in a ‘minimal’ way. Such transitions
may often be described by saying that the agent does only A. The concept of strong
permission may be defined as


(CP*.act) u ⊨ P*A iff f(I_u(A), u) ⊆ Leg_u


One might say that the function f selects from I_u(A) the minimal performances of
A. For example, if Oscar's mother gives him permission to take one cookie, it means
that the action of taking one cookie is acceptable; in other words, the mother would
not punish Oscar if he were to take one cookie and do nothing else. On the other
hand, it is permitted for a driver to flash her right turn signal, but only if she
is going to make a right turn as well. The latter action is an example of weakly
permitted action, whereas the former action (taking a cookie) is strongly permitted.

The formulation (CP*.act) is analogous to one of the standard ways of expressing
the truth-conditions of conditionals by means of a selection function f(I(p), u)
which selects, for each proposition I(p) and a situation u, the p-worlds closest (most
similar) to u; a conditional p > q is true at u iff the consequent q is true at all
selected p-worlds. Thus (CP*.act) fits the most natural reading of a strong
permission to do A: if you were to do A, you would not be doing anything illegal. The
selection function f used in (CP*.act) selects the ‘minimal’ performances of A from
the set of all possible performances of A, just as the truth of a conditional p > q
is determined by the selection of the p-worlds minimally different from the actual
situation (or the situation where the conditional is being evaluated) (Hilpinen,
1993b, p. 309).
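
The contrast between weak and strong permission can be sketched for the two
examples just given. In the Python illustration below the selection function is
not computed from a similarity ordering; it is simply given by a table of
minimal performances, which is an assumption made for brevity, as are the
transition labels and the set of legal transitions.

```python
u = "u"
# Hypothetical transitions out of the situation u.
flash_only = (u, "signal flashed, no turn made")
flash_and_turn = (u, "signal flashed, right turn made")
one_cookie = (u, "one cookie taken")
cookie_spree = (u, "one cookie taken, jar emptied as well")

leg_u = {flash_and_turn, one_cookie}   # legal transitions from u

# I_u: all possible performances of each action at u.
I_u = {
    "flash": {flash_only, flash_and_turn},
    "take_cookie": {one_cookie, cookie_spree},
}

# Assumed selection function: the performances in which the agent
# does only the action in question.
MINIMAL = {
    "flash": {flash_only},
    "take_cookie": {one_cookie},
}

def f(action):
    return MINIMAL[action]

def weakly_permitted(action):     # some performance is legal
    return bool(I_u[action] & leg_u)

def strongly_permitted(action):   # every minimal performance is legal
    return f(action) <= leg_u

print(weakly_permitted("flash"), strongly_permitted("flash"))
print(weakly_permitted("take_cookie"), strongly_permitted("take_cookie"))
```

Flashing the turn signal comes out weakly but not strongly permitted, because
its only legal performance is the non-minimal one that includes the turn,
whereas taking one cookie is permitted in both senses.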

If the disjunctive permission ‘You may do A or B’ is interpreted as a strong
permission in the sense defined by (CP*.act), the truth of


P*(A + B) ⊃ P*A & P*B (8.40)

depends on whether

f(I_u(A), u) ∪ f(I_u(B), u) ⊆ f(I_u(A + B), u) (8.41)


In other words, it depends on whether the minimal performances of a disjunctive act
include the minimal performances of both disjuncts. The example (8.15) (on page
167) suggests that this need not always be the case; therefore (8.40) is not a logical
truth, but it may hold in many cases; and, for pragmatic reasons, it may normally be
expected to hold in situations in which permission sentences are used performatively,
because otherwise it would not be clear what has been permitted, i.e., which
performances of A + B have been made deontically acceptable. In the example (8.15),
the disjunctive permission is given together with the information that the
permissibility of one of the disjuncts will be determined by another norm authority,
and, consequently, there is no reason to assume that (8.40) should hold in the
example.
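
A situation of the kind described in (8.15) can be sketched as follows. The
Python model below is hypothetical throughout: it assumes that, before the
doctor has been consulted, only the water transition is legal, and that the
selection function counts only water-drinking as a minimal performance of the
disjunctive act. Under these assumptions condition (8.41) fails and so does
(8.40).

```python
u = "u"
water_only = (u, "drinks water")
whisky_only = (u, "drinks whisky")

# Assumed legal transitions: whisky has not (yet) been cleared by the doctor.
leg_u = {water_only}

# Assumed selection function, given as a table of minimal performances.
MINIMAL = {
    "water": {water_only},
    "whisky": {whisky_only},
    "water+whisky": {water_only},   # the disjunctive act A + B
}

def strongly_permitted(action):
    """Strong permission: every minimal performance is legal."""
    return MINIMAL[action] <= leg_u

condition_841 = (MINIMAL["water"] | MINIMAL["whisky"]) <= MINIMAL["water+whisky"]
antecedent_840 = strongly_permitted("water+whisky")
consequent_840 = strongly_permitted("water") and strongly_permitted("whisky")

print(condition_841, antecedent_840, consequent_840)
```

The disjunctive act is strongly permitted while one of its disjuncts is not, so
the implication (8.40) is false in this model.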


8.7. Deontic Logic and the Logic of Agency 


In most recent systems of the logic of the ought-to-do, simple action descriptions 
are not regarded as primitive terms, as outlined above, but are obtained from 
propositional expressions by means of an action operator which is usually read ‘a 
sees to it that’ or ‘a brings it about that.’ Thus simple action descriptions have the 
form Do(a, p), where Do is a modal operator for action or agency, a names an
agent, and p is a propositional expression. This analysis of action sentences goes back
to the eleventh-century philosopher St. Anselm, who investigated the formal
properties of the Latin verb facere, ‘to do’ (Segerberg, 1992).

Kanger (1972) presented an interesting analysis of the concept of seeing to it that
p. He regarded a statement of the form ‘a sees to it that p’, Do(a, p), as a conjunction

(CDO) Do(a, p) ≡ Ds(a, p) & Dn(a, p)


where Ds may be said to represent the sufficient condition aspect of agency and Dn
stands for the necessary condition aspect of agency. Kanger read Ds(a, p) as ‘p is
necessary for something a does’, and Dn(a, p) as ‘p is sufficient for something a does’.
These readings are equivalent to


Ds(a, p): Something a does is sufficient for p (8.42)
Dn(a, p): Something a does is necessary for p (8.43)


Kanger interpreted the agency operators Ds and Dn in terms of two alternativeness
relations on possible worlds:


(CDS) u ⊨ Ds(a, p) iff w ⊨ p for every w such that S_Ds(w, u)
(CDN) u ⊨ Dn(a, p) iff w ⊨ ¬p for every w such that S_Dn(w, u)


The worlds w such that S_Ds(w, u) can be regarded as worlds in which the agent a
performs the same actions as in u. Kanger (1981 [1957]) took S_Dn(w, u) to mean
that ‘the opposite’ of everything a does in u is the case in w. One possible
interpretation of this expression is that a does not do any of the things she does in u, but
(for example) is completely passive (insofar as this is possible), or, for any action B
that a performs at u, she does something else (i.e., some alternative to B) at w.

This analysis of the concept of agency has a form which has become widely
accepted in the recent work on the logic of action. The first condition, the
Ds-condition, may be termed the positive condition, and the second condition, the
Dn-condition, may be termed the negative condition of agency. The latter condition is a
counterfactual condition of agency; it states that if the agent had not acted the way
she did, p would not have been the case. An analysis of this kind was put forward by
von Wright (1963, 1968); other versions of the analysis of agency by means of a
positive and a negative condition have been formulated by Åqvist (1974), Åqvist
and Mullock (1989), Lindahl (1977), Pörn (1977), and more recently by Belnap,
Horty, Perloff, and others; see Horty (2000) and the references given in it.

Philosophers have disagreed about the formulation of the negative condition.
Pörn (1977) has argued that, instead of Kanger’s Dn-condition (CDN), one should
accept only a weaker negative requirement: ¬Dn(a, ¬p), abbreviated here Cn(a, p).


(ACN) u ⊨ Cn(a, p) iff w ⊨ ¬p for some w such that S_Dn(w, u)


This condition can be read: but for a’s action it might not have been the case that
p (Pörn, 1977, p. 7); in other words, it was not unavoidable for a that p. Åqvist
(1974, p. 81) has accepted a similar weak form of the counterfactual condition.
According to Pörn and Åqvist, the negative condition should be formulated as a
might-statement or a might-conditional, not as a would-conditional. (For a discussion
of different forms of the positive and the negative condition of agency, see Hilpinen
(1997).)
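
The difference between the strong and the weak negative condition can be seen
in a small model. In the Python sketch below the positive condition requires p
in every S_Ds-alternative, Kanger's negative condition requires ¬p in every
S_Dn-alternative, and the weaker condition Cn requires ¬p in at least one of
them. The worlds, the valuation of p, and the two alternativeness relations are
hypothetical, and combining Ds with Cn into a weaker agency operator is an
assumption about how the weaker requirement would be used.

```python
# VAL records the truth-value of the proposition p at each world.
VAL = {"u": True, "w1": True, "w2": False, "w3": True}

# S_DS[u]: worlds in which the agent performs the same actions as in u.
# S_DN[u]: worlds in which the agent does "the opposite" of what she does in u.
# Both relations are hypothetical data for this illustration.
S_DS = {"u": {"u", "w1"}}
S_DN = {"u": {"w2", "w3"}}

def Ds(u):
    """Positive condition: p holds in every S_Ds-alternative to u."""
    return all(VAL[w] for w in S_DS[u])

def Dn(u):
    """Kanger's negative condition: p fails in every S_Dn-alternative to u."""
    return all(not VAL[w] for w in S_DN[u])

def Cn(u):
    """Weaker negative condition: p fails in some S_Dn-alternative to u."""
    return any(not VAL[w] for w in S_DN[u])

def Do_kanger(u):   # (CDO): Ds together with Dn
    return Ds(u) and Dn(u)

def Do_weak(u):     # Ds together with the weak negative condition Cn
    return Ds(u) and Cn(u)

print(Ds("u"), Dn("u"), Cn("u"), Do_kanger("u"), Do_weak("u"))
```

Because p still holds in one of the worlds where the agent acts otherwise, the
would-conditional reading fails while the might-conditional reading succeeds,
so the agent counts as seeing to it that p only on the weaker analysis.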
The Do-operator makes it possible to distinguish four modes of action with
respect to a result (state of affairs or event) p:


Do(a, p): a sees to it that p
¬Do(a, p): a does not see to it that p
Do(a, ¬p): a sees to it that ¬p
¬Do(a, ¬p): a does not see to it that ¬p


The combination of different modes of action with deontic concepts makes it
possible to represent several types of obligation and permission and different legal or
deontic relations between individuals. For example, consider a state of affairs
involving two persons, F(a, b). According to Kanger (1981) and Kanger and Kanger
(1966), the Do-operator can be combined with deontic operators to distinguish four
basic types of right (or different senses of the expression ‘right’):


(R1) ODo(b, F(a, b))
(R2) ¬ODo(a, ¬F(a, b)) ≡ P¬Do(a, ¬F(a, b))
(R3) ¬O¬Do(a, F(a, b)) ≡ PDo(a, F(a, b))
(R4) O¬Do(b, ¬F(a, b))


(R1)-(R4) define four basic normative relations between a and b which from a’s
perspective can be regarded as different relational concepts of right. In (R1), b
has a duty to see to it that F(a, b); this is equivalent to a's claim in relation to b
that F(a, b). (R2) can be described as a’s freedom (or privilege) in relation to b that
F(a, b); this means that a has no obligation to see to it that ¬F(a, b). Kanger called
(R3) a's power in relation to b that F(a, b), and (R4) a's immunity in relation to b
that F(a, b). The replacement of the state of affairs F(a, b) by its opposite ¬F(a, b)
yields four additional concepts of right which Kanger and Kanger (1966) called
counter-claim (R1'), counter-freedom (R2'), counter-power (R3'), and
counter-immunity (R4'). Kanger and Kanger called the eight relations defined in this way
‘simple types of right.’ The normative relationship between any two individuals with
respect to a state of affairs p can be characterized completely by means of the
conjunctions of the eight simple types of right or their negations. There are 2^8 = 256
such conjunctions, but the simple types of right are not logically independent of
each other; according to the logic of the deontic O-operator and the agency operator
Do, only 26 combinations of the simple types of right or their negations are logically
consistent. Kanger and Kanger (1966) called these 26 relations the ‘atomic types of
right.’ The atomic types provide a complete characterization of the possible legal
relationships between two persons with respect to a single state of affairs. It is perhaps
misleading to call these 26 relations ‘types of right,’ because they include as their
constituents duties as well as claims and freedoms. Thus Kanger's theory of normative
relations can be regarded as a theory of duties as well as rights (Lindahl, 1994).

Kanger’s concepts (R1)-(R4) correspond to the four ways of using the word ‘right’
(or four concepts of a right) distinguished by Hohfeld (1919), from which he adopted
the expressions ‘privilege’, ‘power’, and ‘immunity’. Although Kanger apparently
intended (R1)-(R4) as approximate explications of Hohfeld’s notions, his concepts
of power and immunity differ from Hohfeld’s. According to Kanger, both power
and freedom are permissions; a power consists in the permissibility of actively seeing
to it that something is the case, whereas freedom means that there is no obligation
to see to it that the opposite state of affairs should be the case. Lindahl (1977) and
others have argued that Hohfeld’s concept of power should be analyzed as a legal
ability rather than a permission (a can rather than a may); see Bulygin (1992),
Lindahl (1994) and Makinson (1986).
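
The count of 26 atomic types can be reproduced by a small enumeration under
simplifying assumptions that go beyond the text. In the Python sketch below each
agent is in one of three modes with respect to F (seeing to F, seeing to ¬F, or
doing neither); since Do(x, p) implies p, one agent cannot see to F while the
other sees to ¬F; a norm system is represented by a nonempty set of ideal joint
states; and O holds of a condition iff it is satisfied in every ideal state.
The function names and the encoding are invented for the example.

```python
from itertools import combinations, product

F, NOT_F, PASSIVE = "F", "notF", "passive"   # what an agent sees to

# Joint action states (a's mode, b's mode), excluding the two impossible
# cases in which one agent sees to F and the other sees to not-F.
STATES = [(x, y) for x, y in product([F, NOT_F, PASSIVE], repeat=2)
          if {x, y} != {F, NOT_F}]

def O(cond, ideal):
    """Obligation: cond holds in every ideal joint state."""
    return all(cond(s) for s in ideal)

def a_does(mode):
    return lambda s: s[0] == mode

def b_does(mode):
    return lambda s: s[1] == mode

def neg(cond):
    return lambda s: not cond(s)

def simple_types(ideal):
    """Truth-values of the eight simple types of right for one norm system."""
    return (
        O(b_does(F), ideal),                # R1  claim
        not O(a_does(NOT_F), ideal),        # R2  freedom
        not O(neg(a_does(F)), ideal),       # R3  power
        O(neg(b_does(NOT_F)), ideal),       # R4  immunity
        O(b_does(NOT_F), ideal),            # R1' counter-claim
        not O(a_does(F), ideal),            # R2' counter-freedom
        not O(neg(a_does(NOT_F)), ideal),   # R3' counter-power
        O(neg(b_does(F)), ideal),           # R4' counter-immunity
    )

def nonempty_subsets(items):
    for size in range(1, len(items) + 1):
        yield from combinations(items, size)

atomic_types = {simple_types(ideal) for ideal in nonempty_subsets(STATES)}

print(len(STATES), len(atomic_types))
```

Under these assumptions there are 7 possible joint states, and the 127 nonempty
sets of ideal states realize exactly 26 of the 256 truth-value combinations, in
agreement with the number of atomic types reported above.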

An agency operator such as the Do-operator considered above can be iterated, and
it is possible to form sentences which contain several nested occurrences of deontic
operators, agency operators (or action operators), and epistemic operators, relativized
to possibly different agents. This feature has facilitated the applications of deontic logic
and the logic of agency to the analysis of complex social and normative phenomena,
for example, the analysis of different kinds of rights relations and other normative
relations (H. Kanger, 1984; Lindahl, 1994; Makinson, 1986), governmental structures
and the concept of parliamentarism (Kanger and Kanger, 1966), normative positions
and normative change (Jones and Sergot, 1993; Lindahl, 1977; Sergot, 1999), the
analysis of normative control, influence, and responsibility (Pörn, 1989; Santos
and Carmo, 1996), and the analysis of trade procedures and the concept of fraud
(Firozabadi et al., 199).


Suggested further reading. 


Deontic Logic: Introductory and Systematic Readings (Hilpinen, 1981 [1971]) contains some
pioneering contributions to deontic logic, including those by Kanger and Hintikka. New
Studies in Deontic Logic: Norms, Actions and the Foundations of Ethics (Hilpinen, 1981)
contains papers on the ontology of norms, deontic paradoxes, temporal deontic logic, and the
interpretation of quantifiers in deontic logic. Horty's Agency and Deontic Logic (2000) analyzes
the concepts of action and agency in deontic logic and discusses the relevance of deontic logic
to ethical theories, for example, utilitarianism. Norms, Logics and Information Systems, edited
by McNamara and Prakken (1999), contains recent papers on the philosophical foundations
of deontic logic, norm conflicts, and computer systems applications of deontic logic. The
papers in Nute's Defeasible Deontic Logic (1997) analyze defeasible reasoning in normative
discourse, and Carmo and Jones's essay ‘Deontic Logic and Contrary-to-Duties’ (2001) is a
good survey of the problems about CTD obligations.
Chapter 9
Epistemic Logic 
J.-J. Ch. Meyer 


9.1. Introduction: A Brief History of Knowledge 


Knowledge has been a subject of philosophical study since ancient times. This is not 
surprising, since knowledge is crucial for humans to control their actions and the 
appetite for acquiring it seems innate to the human race. Philosophy, therefore, has 
always occupied itself with the question as to the nature of knowledge. This area of 
philosophy is generally referred to as epistemology, from the Greek word for knowledge:
episteme. Plato defined knowledge as ‘justified true belief’ and this definition
has influenced philosophers ever since; cf. Gettier (1963) and Pollock (1986).
Although sensible, this definition does not yet explain the nature of knowledge,
since all of the three notions of ‘justification’, ‘truth’, and ‘belief’ are not yet clear
and still subject to discussion. It would go beyond the scope of our purposes here to
go into this at this moment, but it is touched on later in this chapter.

Further issues concerning knowledge include the question of how it comes to us.
There is the controversy between rationalists, such as Plato and Descartes, who
argued that knowledge only comes via reason(ing), and empiricists, such as Locke
and Hume, who maintained that knowledge derives from sense experience. Kant
considered categories of analytical knowledge (‘derivable by purely logical argument’)
versus synthetic knowledge (where this is not the case) and of a posteriori knowledge
(based on experience) versus a priori knowledge (where this is not the case), which
led to a big debate whether synthetic a priori knowledge is possible.

As is the case with so many things, in the twentieth century, the notion of 
knowledge became amenable to formal-logical analysis. With the development of
formal mathematical logic in the second half of the nineteenth century, the formal
approach also became available to the study of philosophical notions such as time,
necessity, obligation, and also knowledge itself. Most of these logics are collected
under the heading of modal logics, namely logics of certain modalities such as
necessity and possibility. [See chapter 7.] While (formal) modal logics had been
around since the publication of C. I. Lewis’ (1912) paper on an axiomatic approach
to strict implication, the inception of formal modal epistemic logic is often taken to
be Hintikka’s (1962). The period from 1912 up to the 1950s is referred to by Bull and
Segerberg (1984) as the ‘First Wave’ of (formal) modal logic, where syntactic and
algebraic approaches were prevailing, while the period of roughly 1950-80, where
the focus shifted towards model-theoretic semantic approaches, is referred to as the
‘Second Wave’. Hintikka’s work marks the beginning of this Second Wave. In this
chapter, however, epistemic logic is treated as a particular modal logic and models
are considered that have become standard for modal logics in general, namely
so-called Kripke models, based on the work by Kripke (1963), another leading figure
in the Second Wave of formal modal logic. These models employ the notion of a
possible world dating back to the philosopher and mathematician Leibniz. Carnap,
Prior and Kanger also contributed to coining the notion of a possible world model 
(Bull and Segerberg, 1984). 

In the 1980s, computer scientists and researchers in the area of artificial
intelligence (AI) picked up the subject of epistemic logic as a means to reason about the
knowledge ascribed to processors in processes of computation and that of
knowledge-based systems, such as advanced databases, expert systems and so-called
agent-oriented systems, respectively. (In an important sense, this work belongs to a kind of
‘Third Wave’ of modal logic: the use of modal logics in application areas such as
computer science, linguistics and AI.) This chapter reviews briefly their contributions
to epistemic logic and its application, since these concentrate on slightly different
but also quite interesting aspects of knowledge, and their work also, in its turn, has
influenced philosophers again. (Moreover, links were established with another
interesting area of AI, nonmonotonic reasoning, which has some definite relations
with philosophy as well [see chapter 15].)


9.2. The Modal Logic Approach to Knowledge 


This section looks at the basic idea behind modal epistemic logic: modeling
knowledge, or rather ignorance (as shall be seen), by means of accessibility relations as they
are present in Kripke models.

To prepare for the formal treatment, first consider the following situation.
Imagine a person in Amsterdam wondering what the weather is like in New York
(possibly since a friend of his is there on holiday), in particular whether it is raining in
New York. Since he has no information pertaining to this (and clearly cannot obtain
this information by direct observation, unless he is clairvoyant or has access to an
internet site with this information, which is assumed not to be the case), this person
will consider two possible situations, one in which it rains in New York, and one in
which this is not the case. Note that the lack of knowledge of an agent can be
represented as the agent’s considering a number of situations as possible. In this
example, there are only two such possible situations, resulting from being ignorant
about one propositional item, but clearly, if one lacks knowledge about more items,
the number of situations that are held possible will increase. Generally, if
one is ignorant about the truth of $n$ propositional atoms, one has to consider $2^n$
situations. For example, if one totally lacks knowledge about whether it rains in New
York ($p$) and whether it rains in Los Angeles ($q$), one has to reckon with four
situations: one in which both $p$ and $q$ are true, one in which $p$ is true and $q$ is false,
one in which $p$ is false and $q$ is true, and one in which both $p$ and $q$ are false. Since
the situations to be considered stem from (lack of) knowledge, they are called
epistemically alternative worlds or, for short, epistemic alternatives.
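The counting argument can be made concrete with a small sketch. The function name `situations` and the atom names are illustrative assumptions, not notation from the chapter; the code merely enumerates all truth assignments over a given list of atoms.

```python
from itertools import product

def situations(atoms):
    """Return every truth assignment over the given atoms (2**n of them)."""
    return [dict(zip(atoms, values))
            for values in product([True, False], repeat=len(atoms))]

# p: it rains in New York, q: it rains in Los Angeles
for situation in situations(["p", "q"]):
    print(situation)
```

With two atoms the loop prints four assignments; each additional atom doubles the number of situations the agent has to reckon with.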

The idea of considering several epistemic alternatives in case one does not have complete
knowledge about the situation at hand fits perfectly into the framework
of Kripke-style possible world semantics. Assume a set $\mathcal{P}$ of propositional atoms. Use
the symbols T and F for the truth values (true and false, respectively). Formally, a
Kripke model is a structure of the following form:


Definition 9.1 A Kripke model is a structure $\mathcal{M}$ of the form $(S, \pi, R)$, where


• $S$ is a non-empty set (the set of possible worlds);

• $\pi : S \to (\mathcal{P} \to \{T, F\})$ is a truth assignment function to the atoms per possible
world;

• $R \subseteq S \times S$ is the knowledge accessibility relation.





By means of a Kripke model, one can represent exactly what an agent considers as
the epistemic alternatives in a certain situation: given a situation (represented again
by a possible world $s \in S$), the epistemic alternatives for the agent are given by the
set $\{t \in S \mid R(s, t)\}$, i.e. all possible worlds $t$ that are accessible from $s$ by means of the
relation $R$.

Thus the example above can be represented in a Kripke model as follows;
see figure 9.1. Suppose that the actual situation at hand (which the agent does not
have complete knowledge about) is that it rains in New York but not in LA,
represented by a state $s_2 \in S$ for which it holds that $\pi(s_2)(p) = T$ and $\pi(s_2)(q) = F$.
Now the model can be represented by taking $S = \{s_1, s_2, s_3, s_4\}$, where $s_1$ is such that
$\pi(s_1)(p) = \pi(s_1)(q) = T$, $s_2$ is as above, $s_3$ is such that $\pi(s_3)(p) = F$ and $\pi(s_3)(q) = T$, and
$s_4$ is such that $\pi(s_4)(p) = \pi(s_4)(q) = F$. The relation $R$ of the model is given by $R(s_2, t)$
for every $t \in S$. To represent that the agent has more information in situation $s_1$,
e.g. that the agent knows that it is raining in LA (perhaps because the situation is so
unusual that it has been on the news), the relation $R$ in the model can be extended
by stipulating $R(s_1, t)$ for $t = s_1, s_3$. Now the agent has no doubt anymore about the
truth of proposition $q$, but is still ignorant about the truth value of proposition $p$.
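A minimal sketch of such a four-world model may help fix ideas. It assumes worlds named `s1` to `s4`, with `s2` as the actual situation (rain in New York, none in LA); the dictionary encoding and the helper `alternatives` are illustrative choices rather than anything prescribed by the text.

```python
# Truth assignment per world: p = rain in New York, q = rain in Los Angeles.
valuation = {
    "s1": {"p": True,  "q": True},
    "s2": {"p": True,  "q": False},   # assumed to be the actual situation
    "s3": {"p": False, "q": True},
    "s4": {"p": False, "q": False},
}
worlds = set(valuation)

# From s2 the agent holds every world possible; from s1 only worlds where q is true.
access = {("s2", t) for t in worlds} | {("s1", "s1"), ("s1", "s3")}

def alternatives(s):
    """Epistemic alternatives of s: all worlds t with (s, t) in the relation."""
    return {t for (u, t) in access if u == s}

print(sorted(alternatives("s2")))
print(sorted(alternatives("s1")))
```

In every alternative of `s1` the atom `q` is true while `p` varies, which is how the model encodes certainty about LA combined with ignorance about New York.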





Figure 9.1
On the basis of Kripke models, a modal logic of knowledge can be devised.
To this end, introduce a modal operator $K$, to be interpreted as ‘it is known that,’
and give it a formal semantics by a clause: for a Kripke model $\mathcal{M} = (S, \pi, R)$ and $s \in S$,


$\mathcal{M}, s \models K\varphi$ iff (if and only if) for all $t$ with $R(s, t)$ it holds that $\mathcal{M}, t \models \varphi$


This clause states that, in a possible world $s$, it is known (by the agent) that the
formula $\varphi$ is true iff $\varphi$ is true in all the worlds $t$ that the agent deems epistemic
alternatives. In other words, although one may have doubts about the true nature of
the world (if one considers more than one epistemic alternative as possible), one has
no doubts about the truth of $\varphi$: this formula holds in all epistemic alternatives.
Thus, it can really be said that, in this case, one knows the formula $\varphi$.

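To complete the logic, assume that, besides propositional atoms from $\mathcal{P}$, formulas
can also be composed by means of the usual propositional connectives $\neg$ (not),
$\wedge$ (and), $\vee$ (or), $\to$ (implication) and $\leftrightarrow$ (bi-implication), with their usual semantics,
such as, e.g.:


$\mathcal{M}, s \models \neg\varphi$ iff not $\mathcal{M}, s \models \varphi$
$\mathcal{M}, s \models \varphi \wedge \psi$ iff $\mathcal{M}, s \models \varphi$ and $\mathcal{M}, s \models \psi$


Propositional atoms $p \in \mathcal{P}$ are, of course, interpreted by using the truth assignment
function $\pi$:


$\mathcal{M}, s \models p$ iff $\pi(s)(p) = T$


Finally, a formula $\varphi$ in this logic is said to be valid, notation $\models \varphi$, if $\mathcal{M}, s \models \varphi$ for all
Kripke models $\mathcal{M} = (S, \pi, R)$ and all $s \in S$.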
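The truth clauses above translate almost directly into a recursive evaluator. The following is a sketch under stated assumptions: formulas are encoded as nested tuples such as `("K", ("atom", "p"))`, a model is a triple `(S, pi, R)` with `R` a set of pairs, and the names `holds` and `true_everywhere` are hypothetical. Note that `true_everywhere` only checks a single model, whereas validity quantifies over all Kripke models.

```python
def holds(model, s, f):
    """Return True iff formula f is true at world s of model = (S, pi, R)."""
    S, pi, R = model
    op = f[0]
    if op == "atom":                       # M, s |= p  iff  pi(s)(p) = T
        return pi[s][f[1]]
    if op == "not":
        return not holds(model, s, f[1])
    if op == "and":
        return holds(model, s, f[1]) and holds(model, s, f[2])
    if op == "implies":
        return (not holds(model, s, f[1])) or holds(model, s, f[2])
    if op == "K":                          # f[1] must hold in every accessible world
        return all(holds(model, t, f[1]) for (u, t) in R if u == s)
    raise ValueError(f"unknown operator: {op}")

def true_everywhere(model, f):
    """Truth of f at every world of this one model (not validity in general)."""
    return all(holds(model, s, f) for s in model[0])

pi = {"s1": {"p": True, "q": True}, "s3": {"p": False, "q": True}}
model = ({"s1", "s3"}, pi, {("s1", "s1"), ("s1", "s3")})
p, q = ("atom", "p"), ("atom", "q")

print(holds(model, "s1", ("K", q)))   # q is true in both alternatives of s1
print(holds(model, "s1", ("K", p)))   # p is false in the alternative s3
```

A world without any accessible worlds satisfies every formula of the form $K\varphi$ vacuously, which the `all(...)` over an empty generator reproduces.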

By interpreting the operator $K$ in the above way, one directly obtains a number of
validities:


Proposition 9.1


1 $\models K(\varphi \to \psi) \to (K\varphi \to K\psi)$
2 If $\models \varphi$ then $\models K\varphi$


This proposition says that by modeling knowledge in this way, it is closed under
logical consequence. Furthermore, validities are always known. With respect to an
idealized notion of knowledge, these properties are certainly defensible. For more
practical purposes (when using the notion of knowledge in certain applications, e.g.
describing the knowledge of human or artificial beings such as robots) they may be
undesirable. In this case, one may speak of the so-called problem of logical omniscience,
discussed in section 9.5 below. For the time being, one accepts these properties of
knowledge, and wonders what other properties knowledge should satisfy.
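Both items of proposition 9.1 follow directly from the truth clause for $K$. The short argument below is a sketch added for illustration; it uses only the definitions given so far, for an arbitrary Kripke model $\mathcal{M} = (S, \pi, R)$ and world $s \in S$.

```latex
\begin{align*}
\textbf{Item 1.}\quad
 & \text{Assume } \mathcal{M}, s \models K(\varphi \to \psi)
   \text{ and } \mathcal{M}, s \models K\varphi. \\
 & \text{Then for every } t \text{ with } R(s,t):\ 
   \mathcal{M}, t \models \varphi \to \psi
   \text{ and } \mathcal{M}, t \models \varphi, \\
 & \text{so for every such } t:\ \mathcal{M}, t \models \psi,
   \text{ hence } \mathcal{M}, s \models K\psi. \\[4pt]
\textbf{Item 2.}\quad
 & \text{Assume } \models \varphi.
   \text{ Then } \mathcal{M}, t \models \varphi \text{ for every } t \in S, \\
 & \text{in particular for every } t \text{ with } R(s,t),
   \text{ hence } \mathcal{M}, s \models K\varphi.
\end{align*}
```

Neither step uses any property of $R$, which is why these two properties hold in the class of all Kripke models.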

Finally in this section, note that the valid formulas with respect to the class of
Kripke models that have been introduced here can be axiomatized by the following:
(P) any axiomatization of propositional logic
(K) $K(\varphi \to \psi) \to (K\varphi \to K\psi)$


and rules modus ponens (MP) and
($N_K$) $\varphi / K\varphi$


The validity (K) is generally referred to as the K-axiom, while rule $N_K$ is called the
necessitation rule.

Technically, one can show that this system K is sound and complete with respect to
the class of all Kripke models, which states that the set of theorems in this system is
exactly the set of validities (with respect to the class of all Kripke models). Since the
proof of this is rather technical, it is omitted here, but it can be found in many
textbooks on modal logic; see, for example, Chellas (1980), Hughes and Cresswell
(1984) and Meyer and van der Hoek (1995) [and chapter 7].


9.3. The Systems T, S4, and S5


As seen in section 9.2, the notion of knowledge, as captured by a modal logic based
on Kripke models of the form introduced there, satisfies certain properties. However,
some properties that intuitively hold of knowledge are not validities in this setting.
For instance, one of the defining properties of knowledge is that it is true! That is,
in a formula: $K\varphi \to \varphi$, if it is known that $\varphi$ then $\varphi$ must be true. This formula,
however, is not a validity in the framework given thus far [see chapter 7].

This can be remedied, however, by putting constraints on the class of Kripke
models being considered. If one stipulates that the accessibility relation $R$ is reflexive,
i.e. satisfies the constraint $R(s, s)$ for all $s \in S$, then the formula $K\varphi \to \varphi$ becomes
valid with respect to this new class of models.


Proposition 9.2 Any Kripke model $\mathcal{M} = (S, \pi, R)$ where $R$ is reflexive, satisfies
$\mathcal{M} \models K\varphi \to \varphi$.
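The role of reflexivity can be seen in a one-world sketch. The helper `knows` and the world name `s` are illustrative assumptions; the function only handles atomic formulas, which suffices for this counterexample.

```python
def knows(R, pi, s, atom):
    """K(atom) at s: the atom is true in every world accessible from s."""
    return all(pi[t][atom] for (u, t) in R if u == s)

pi = {"s": {"p": False}}          # p is false in the only world
R_empty = set()                   # s has no epistemic alternatives at all
R_reflexive = {("s", "s")}        # s is its own epistemic alternative

# Without reflexivity, Kp holds vacuously at s although p is false there,
# so Kp -> p is refuted.
print(knows(R_empty, pi, "s", "p"), pi["s"]["p"])

# With reflexivity, Kp fails at s, so this model no longer refutes Kp -> p.
print(knows(R_reflexive, pi, "s", "p"))
```

Reflexivity forces the world of evaluation to be among its own alternatives, so whatever is known there must be true there.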


Extending system K of the previous section with axiom $K\varphi \to \varphi$ gives the system
referred to as system T. This system can be shown to be sound and complete with
respect to the new class of models, i.e. the class of all Kripke models in which the
accessibility relation is reflexive ([see chapter 7]; see also Chellas (1980), Hughes
and Cresswell (1984), and Meyer and van der Hoek (1995)).

Furthermore, it would also be reasonable to have a property stating that
knowledge is known itself, expressed by the formula $K\varphi \to KK\varphi$: if one knows $\varphi$, then
one also knows that one knows $\varphi$. This formula is not a validity in the setting
presented thus far either. But, again, constraints on the class of Kripke models can
be introduced to overcome this difficulty. If the accessibility relation $R$ in a model
$\mathcal{M} = (S, \pi, R)$ is required to be transitive, i.e. satisfies the constraint that


$R(s, t) \;\&\; R(t, u) \Rightarrow R(s, u)$


for all $s, t, u \in S$, then the formula $K\varphi \to KK\varphi$ becomes a validity with respect to
this class of models.


Proposition 9.3 Any Kripke model $\mathcal{M} = (S, \pi, R)$ where $R$ is transitive, satisfies
$\mathcal{M} \models K\varphi \to KK\varphi$.
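A three-world sketch shows why transitivity matters here. Formulas are represented extensionally by the set of worlds where they hold; the function `K` and the world names `a`, `b`, `c` are illustrative assumptions.

```python
def K(worlds, R, truth_set):
    """Worlds where K(phi) holds, given the set of worlds where phi holds."""
    return {s for s in worlds
            if all(t in truth_set for (u, t) in R if u == s)}

worlds = {"a", "b", "c"}
p_worlds = {"a", "b"}                     # p is false only at c

# Not transitive: a sees b and b sees c, but a does not see c.
R = {("a", "b"), ("b", "c")}
Kp = K(worlds, R, p_worlds)
KKp = K(worlds, R, Kp)
print("a" in Kp, "a" in KKp)              # Kp holds at a, KKp does not

# Adding the missing pair makes the relation transitive.
R_trans = R | {("a", "c")}
Kp_t = K(worlds, R_trans, p_worlds)
KKp_t = K(worlds, R_trans, Kp_t)
print(Kp_t <= KKp_t)                      # every Kp-world is a KKp-world
```

In the non-transitive model the agent at `a` knows `p` but cannot rule out being at `b`, where `p` is not known; once `c` becomes accessible from `a`, the agent no longer knows `p` at `a` in the first place.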


Extending the system T with axiom $K\varphi \to KK\varphi$ gives a system called S4, which is
a well-known axiom system for knowledge (at least in philosophy). The axiom is
called the positive introspection axiom, since it states something about the agent's
own knowledge about knowledge. Again, it can be shown that S4 is sound and
complete with respect to models with accessibility relations that are reflexive and
transitive ([see chapter 7]; see also Chellas (1980), Hughes and Cresswell (1984)
and Meyer and van der Hoek (1995)).

Now one can ask whether there is more to knowledge. Can further
properties of knowledge be identified? This issue will be discussed later, but first it is
relevant to mention here that in computer science and AI, where epistemic logic is
employed to describe the ‘knowledge’ of artificial systems like (distributed)
computer systems, information systems and ‘intelligent’ systems such as ‘agent systems’ and
robots (Meyer and van der Hoek, 1995), it is customary to also add another axiom,
which says something about knowledge of ignorance.

This axiom is called the negative introspection axiom:


$\neg K\varphi \to K\neg K\varphi$


It states that if the agent does not know formula $\varphi$, then it knows that it does not
know $\varphi$. Of course, for human agents this axiom is highly unlikely to hold in
general, since one may not even be aware of one’s not knowing $\varphi$. However, for
some artificial agents, dealing with finite information, like only a finite set of
propositional atoms and a finite set of formulas that it knows, the truth of this axiom
may be argued (informally) like this: if the artificial agent does not know a formula,
then this formula does not follow from the agent's finite information, and the agent
is able to detect this, so that it knows that it does not know the formula. Also, in
some cases, the validity of the axiom follows directly from the special kind of models
that is used in applications, as in the case of using epistemic logic in distributed
systems, cf. Halpern and Moses (1990) and Meyer and van der Hoek (1995).

To cater to the validity of the negative introspection axiom, one has to constrain
the (accessibility relations of the) Kripke models even further. One can show that by
requiring the relation $R$ to be an equivalence relation, namely a relation that satisfies
the three properties:
• Reflexivity: $R(s, s)$ for all $s \in S$
• Transitivity: $R(s, t) \;\&\; R(t, u) \Rightarrow R(s, u)$ for all $s, t, u \in S$
• Symmetry: $R(s, t) \Rightarrow R(t, s)$ for all $s, t \in S$


one obtains that the negative introspection axiom as well as all axioms of system S4
are valid with respect to this new class of Kripke models. (And, of course, also the
rules of modus ponens and necessitation remain sound.)
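The three properties, and one instance of negative introspection, can be checked mechanically on a finite model. This is a sketch with hypothetical helper names (`is_reflexive`, `is_symmetric`, `is_transitive`, `is_equivalence`, `K`); formulas are again represented by the set of worlds where they hold.

```python
def is_reflexive(S, R):
    return all((s, s) in R for s in S)

def is_symmetric(R):
    return all((t, s) in R for (s, t) in R)

def is_transitive(R):
    return all((s, u) in R for (s, t) in R for (t2, u) in R if t == t2)

def is_equivalence(S, R):
    return is_reflexive(S, R) and is_symmetric(R) and is_transitive(R)

def K(S, R, truth_set):
    """Worlds where K(phi) holds, given the set of worlds where phi holds."""
    return {s for s in S if all(t in truth_set for (u, t) in R if u == s)}

S = {1, 2, 3}
R = {(1, 1), (2, 2), (3, 3), (1, 2), (2, 1)}   # classes {1, 2} and {3}
print(is_equivalence(S, R))
print(is_equivalence(S, R - {(2, 1)}))         # symmetry is broken

# Negative introspection for one formula: wherever p is not known,
# it is known that p is not known.
p_worlds = {1, 3}
not_Kp = S - K(S, R, p_worlds)
print(not_Kp <= K(S, R, not_Kp))
```

The last check covers only a single formula on a single model; the general claim is the soundness result stated in the text.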

The new system is known as S5, and, as noted above, is very popular among
computer scientists who use epistemic logic.¹ One of the reasons is the very intuitive
interpretation of the models with equivalence relations as accessibility relations that
are briefly discussed below. For the reason given before, philosophers do not regard
S5 as a correct logic for knowledge. They usually stick to S4, and possibly some
logics in between S4 and S5. The system S5 can be shown sound and complete with
respect to the class of Kripke models in which the accessibility relations are
equivalence relations ([see chapter 7]; see also Chellas (1980), Hughes and Cresswell
(1984), and Meyer and van der Hoek (1995)).

Equivalence relations divide the set of possible worlds into equivalence classes, the
members of which are all mutually accessible. An equivalence class is, so to speak, a
bunch of worlds that are epistemic alternatives of each other. One can show that in
the case that one has only one knowledge operator as here, one can restrict oneself
to equivalence relations with only one equivalence class without losing soundness
and completeness of the logic. Such a model is particularly simple: it just consists of
a set of states which are all mutually accessible, or, speaking in epistemic terms, are
all each other's epistemic alternatives. So, in such a case, it does not matter what is
the actual world where one is considering alternatives: for each world there is exactly
the same set of alternatives: the whole set $S$ of possible worlds (Meyer and van der
Hoek, 1995).

A final note: as can be verified (see, for example, Meyer and van der Hoek (1995)),
the system S5 contains a redundancy: the positive introspection axiom can be
deleted since it can be derived from the other axioms together with the rules.
Nevertheless, in the sequel when speaking about the system S5, it is convenient to
include the positive introspection axiom as well.





9.4. Belief: The Systems K45 and KD45 


Belief is mostly regarded as a weaker form of knowledge (but see later in section 
9.6.2). The crucial difference between knowledge and belief is that the former must 
be true whereas the latter need not. When considering properties (axioms) of belief 
rather than knowledge, those of knowledge can be copied except for the one stating 
that knowledge is true. Mostly, the modal operator for belief is denoted $B$: $B\varphi$ is
read as ‘it is believed that $\varphi$’ or ‘the agent believes that $\varphi$.’ Copying the system S5
without the ‘truth axiom’ for belief gives the system known as K45:


• any axiomatization of propositional logic
• $B(\varphi \to \psi) \to (B\varphi \to B\psi)$
• $B\varphi \to BB\varphi$
• $\neg B\varphi \to B\neg B\varphi$


and rules modus ponens (MP) and
($N_B$) $\varphi / B\varphi$


The class of Kripke models with respect to which system K45 is sound and
complete consists of those models in which the accessibility relation $R$ (now used to
interpret the operator $B$, of course) is transitive and Euclidean, the latter meaning
that $R$ satisfies:


$R(s, t) \;\&\; R(s, u) \Rightarrow R(t, u)$


for all $s, t, u \in S$ ([see chapter 7]; also Meyer and van der Hoek (1995)).

Mostly, it is also stipulated that beliefs should be consistent, in a formula:
$\neg B(p \wedge \neg p)$, for some $p \in \mathcal{P}$. Adding this formula to the system K45 as an axiom
(often called the D-axiom, since it was held to be a typical axiom of deontic logic [see
chapter 8]) yields the system KD45, or weak S5. This system can be proven sound
and complete with respect to Kripke models in which the accessibility relation is
transitive, Euclidean, and serial, where seriality of a relation $R$ means that for all
$s \in S$ there exists $t \in S$ such that $R(s, t)$. This property expresses that, in any possible
world, the agent considers at least one epistemic alternative.
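The link between seriality and the D-axiom can be sketched as follows. The helper names are hypothetical. The second function relies on the observation that no world satisfies $p \wedge \neg p$, so $B(p \wedge \neg p)$ holds at a world exactly when that world has no accessible worlds at all.

```python
def is_serial(S, R):
    """Every world has at least one accessible world."""
    return all(any((s, t) in R for t in S) for s in S)

def believes_contradiction(R, s):
    """B(p and not p) at s holds iff s has no accessible world."""
    return not any(u == s for (u, _) in R)

S = {"s", "t"}
R = {("s", "t")}                       # t is a dead end
print(is_serial(S, R), believes_contradiction(R, "t"))

R_serial = R | {("t", "t")}            # give t an epistemic alternative
print(is_serial(S, R_serial),
      any(believes_contradiction(R_serial, w) for w in S))
```

Once every world has a successor, no world can satisfy $B(p \wedge \neg p)$, which is what the D-axiom demands.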

As with S5, it can be shown (Meyer and van der Hoek, 1995) that K(D)45 is
(sound and) complete with respect to a class of simpler models, in this case models
consisting of an ‘actual’ world $s_0$ and a set $S$ of worlds not including $s_0$ such that the
accessibility relation $R$ satisfies $R(s_0, s)$ for each $s \in S$ and $R(s, t)$ for any $s, t \in S$. In
case one considers KD45, the set $S$ is non-empty, whereas, in the case of K45, it
may be empty. This provides a neat picture which can be interpreted philosophically
in a very intuitive way: these simple models for K(D)45-belief consist of an actual
world (representing the current state of the external world) together with a set of
epistemic alternatives, or put differently, an actual world and an (S5-)epistemic
model, which in the case of K45 may be empty (representing inconsistent belief).
In general, the actual world may have nothing to do with the epistemic model,
reflecting the fact that beliefs may be ‘counterfactual’ in the sense that they may be
false in reality.
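The shape of these simple models is easy to construct and inspect. The sketch below assumes an actual world named `s0` and a set `S` of alternatives; all function names are illustrative. It builds the relation described above and checks that it is transitive, Euclidean and (for non-empty `S`) serial, while in general not reflexive.

```python
def simple_belief_model(actual, S):
    """Relation with actual -> every s in S, and all pairs within S."""
    return {(actual, s) for s in S} | {(s, t) for s in S for t in S}

def is_transitive(R):
    return all((s, u) in R for (s, t) in R for (t2, u) in R if t == t2)

def is_euclidean(R):
    return all((t, u) in R for (s, t) in R for (s2, u) in R if s == s2)

def is_serial(worlds, R):
    return all(any((s, t) in R for t in worlds) for s in worlds)

S = {"a", "b"}
worlds = {"s0"} | S
R = simple_belief_model("s0", S)
print(is_transitive(R), is_euclidean(R), is_serial(worlds, R))
print(("s0", "s0") in R)        # the actual world need not be an alternative

# With an empty set of alternatives (the K45 case) seriality fails at s0.
R_empty = simple_belief_model("s0", set())
print(is_serial({"s0"}, R_empty))
```

Since `s0` is not among its own alternatives, what is believed at `s0` need not be true there, matching the remark that beliefs may be false in reality.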

Note: contrary to the case of S5, the positive introspection axiom for belief is not
redundant in the systems K45 and KD45!


9.5. Logical Omniscience: The Problem and Some Solutions 


As seen in a previous section, a modal approach to knowledge (and belief) based on
Kripke models of the kind as defined thus far yields that knowledge (belief) is closed
under logical consequence and that validities are known (believed) (proposition 9.1).
In fact, there are a number of further properties, collectively called properties of
logical omniscience, since they have to do with some idealizations on the part of the
knowing (believing) agent (here $\Box$ stands for either the knowledge operator $K$ or
the belief operator $B$):


Proposition 9.4 


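LO1: $\models \Box\varphi \wedge \Box(\varphi \to \psi) \to \Box\psi$
LO2: $\models \varphi \;\Rightarrow\; \models \Box\varphi$
LO3: $\models \varphi \to \psi \;\Rightarrow\; \models \Box\varphi \to \Box\psi$
LO4: $\models \varphi \leftrightarrow \psi \;\Rightarrow\; \models \Box\varphi \leftrightarrow \Box\psi$
LO5: $\models (\Box\varphi \wedge \Box\psi) \to \Box(\varphi \wedge \psi)$
LO6: $\models \Box\varphi \to \Box(\varphi \vee \psi)$
LO7: $\models \neg(\Box\varphi \wedge \Box\neg\varphi)$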


Properties LO1 and LO2 were already mentioned in proposition 9.1 for $K$. More
precisely, LO1 says that if both $\varphi$ and the implication $\varphi \to \psi$ are known (believed)
then also $\psi$ is known (believed). LO3 is a similar property but slightly different: if
some formula $\varphi$ is known (believed) then also everything ($\psi$) that is a logical
consequence is known (believed). LO4 says that logically equivalent formulas are
either both known (believed) or both not known (believed). LO5 says that if both
$\varphi$ is known (believed) and $\psi$ is known (believed) then also the conjunction of $\varphi$ and
$\psi$ is known (believed). LO6 says that if $\varphi$ is known (believed) then it is also known (believed)
that $\varphi$ or $\psi$. (In fact, this is a direct consequence of property LO3.) LO7 says that
it cannot be the case that both a formula and its negation are known (believed).
Sometimes, it is very convenient to consider these properties as true, but in some
more practical situations these formulas might be deemed unrealistic. For instance, it
is very unlikely that human agents will know (believe) all logical consequences of
their knowledge (beliefs) including all validities. Although at first sight a reasonable
property, even LO4 is unlikely to hold for human agents: imagine two logically
equivalent formulas both of length greater than 10 million characters. These formulas
are not even parsable for the unfortunate agent, let alone known to be
equivalent! So, sometimes it is argued that on the grounds of the resource boundedness of
an agent one has to deny or at least weaken the properties LO1-LO7. However, this
is not as simple as it sounds. Recall that these validities are the very properties of
Kripke-style modal logic as expounded thus far. The formulas LO1-LO6 are true of
all Kripke models. (LO7 can be denied by taking accessibility relations that are not
serial, as seen before. However, in the case of knowledge one is still stuck with this
property: the models that are associated with the systems S4 and S5 have reflexive,
and thus serial, accessibility relations.)

Therefore, to deny the above properties something ‘non-standard’ is needed. In
the literature (Thijsse, 1992; Meyer and van der Hoek, 1995) there appear quite a
number of approaches varying in ‘drasticality.’ Three such approaches are presented
here, starting with a rather radical method considering ‘non-standard’ Kripke models
in which nonstandard (‘impossible’) worlds are present. Here, the focus will be on
solving the logical omniscience problem for belief rather than knowledge, since, in
the context of belief, the problem seems to be more pressing.


9.5.1. Rantala models 


Rantala models are a non-standard type of Kripke models in which, besides the 
possible worlds, also so-called ‘impossible worlds’ are incorporated (Rantala, 1982).
The idea behind these ‘impossible’ worlds is that, as the name suggests, strange
things may hold there: in these impossible worlds, anything may be the case, even
contradictions may be true there! Thus these worlds are impossible in the true sense
of the word. However, they can nevertheless be regarded as epistemic alternatives by
agents which are not ideal reasoners (are less rational). And this is exactly what is
needed to avoid the agent’s logical omniscience.

Formally, (epistemic) Rantala models are structures of the following kind (here
$\mathcal{L}$ stands for the whole logical language):


Definition 9.2 A(n epistemic) Rantala model is a structure $\mathcal{M}$ of the form
$(S, \sigma, T, S^*)$, where


• $S$ is a non-empty set, the set of (possible and impossible) worlds;

• $S^* \subseteq S$ is the set of impossible worlds;

• $\sigma : (S \setminus S^* \to (\mathcal{P} \to \{T, F\})) \cup (S^* \to (\mathcal{L} \to \{T, F\}))$ is a truth assignment
function, assigning truth values to atoms on possible worlds and to arbitrary
formulas on impossible worlds;

• $T \subseteq S \times S$ is the belief accessibility relation, for which seriality and
Rantala-model versions of transitivity and Euclidicity are required:

1. for all $s, t, u \in S \setminus S^*$: $T(s, t) \;\&\; T(t, u)$ implies $T(s, u)$, and for all $s \in S \setminus S^*$,
$t^* \in S^*$: $T(s, t^*) \;\&\; \sigma(t^*)(\Box\varphi) = T$ implies $\mathcal{M}, t' \models \varphi$ for some $t' \in S$
with $T(s, t')$;

2. for all $s, t, u \in S \setminus S^*$: $T(s, t) \;\&\; T(s, u)$ implies $T(t, u)$, and for all $s \in S \setminus S^*$,
$t^* \in S^*$: $T(s, t^*) \;\&\; \sigma(t^*)(\Box\varphi) = T$ implies $\mathcal{M}, t' \models \varphi$ for all $t' \in S$ with
$T(s, t')$.








Formulas in possible worlds $s \in S \setminus S^*$ are interpreted in exactly the same way as in
Kripke models, including the clause for the modal operator $\Box$:


$\mathcal{M}, s \models \Box\varphi$ iff $\mathcal{M}, t \models \varphi$ for every $t \in S$ such that $T(s, t)$


However, in impossible worlds $s^* \in S^*$ every formula is regarded as atomic, and
given its truth value by means of the truth assignment function $\sigma$:


$\mathcal{M}, s^* \models \varphi$ iff $\sigma(s^*)(\varphi) = T$
Thus, it may happen that, for example, the formula $p \wedge \neg p$ is assigned the value T by
the function $\sigma$ in an impossible world $s^* \in S^*$. Formulas are valid if they are true
in every possible world $s \in S \setminus S^*$ in any Rantala model $\mathcal{M} = (S, \sigma, T, S^*)$. This is very
understandable: the worlds in which one evaluates are the worlds from which one
takes up a stance and considers epistemic alternatives. Although these alternatives
may be ‘impossible,’ the worlds of evaluation represent the actual world and thus
must be ‘possible.’
The feature of allowing for these impossible worlds gives one the possibility to
deny all of the formulas LO1-LO7, so that none of them are validities with respect
to Rantala models. For instance, consider the possibility of denying LO7 in Rantala
models. This is very easy: just take a model $\mathcal{M} = (S, \sigma, T, S^*)$ with $S = \{s, s^*\}$,
$S^* = \{s^*\}$, $T = \{(s, s^*), (s^*, s^*)\}$ and $\sigma(s^*)(p) = \sigma(s^*)(\neg p) = T$. Then $\mathcal{M}, s \models \Box p \wedge \Box\neg p$,
i.e. $\mathcal{M}, s \not\models$ LO7. Moreover, by stipulating that $\sigma(s^*)(\varphi \vee \neg\varphi) = F$ one can deny the
validity of LO2, since now $\mathcal{M}, s \not\models \Box(\varphi \vee \neg\varphi)$. In the same way, the other logical
omniscience properties can be denied. Finally, note too that, due to the condition of
‘Rantala-transitivity’ on the model, the positive introspection axiom is a validity
again, as can be easily verified.
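The way an impossible world defeats LO7 and LO2 can be sketched in a few lines. Several assumptions are made for illustration only: formulas are nested tuples, the worlds are named `s` and `s*`, formulas not listed for an impossible world default to false, and the additional Rantala conditions on the accessibility relation are not checked.

```python
def holds(model, s, f):
    """Truth of formula f at world s of a Rantala-style model (S, sigma, T, S_star)."""
    S, sigma, T, S_star = model
    if s in S_star:
        # Impossible world: every formula is atomic and looked up directly.
        return sigma[s].get(f, False)      # assumption: unlisted formulas are false
    op = f[0]
    if op == "atom":
        return sigma[s][f[1]]
    if op == "not":
        return not holds(model, s, f[1])
    if op == "and":
        return holds(model, s, f[1]) and holds(model, s, f[2])
    if op == "or":
        return holds(model, s, f[1]) or holds(model, s, f[2])
    if op == "box":
        return all(holds(model, t, f[1]) for (u, t) in T if u == s)
    raise ValueError(f"unknown operator: {op}")

p = ("atom", "p")
not_p = ("not", p)
sigma = {
    "s": {"p": True},                      # possible world: atoms only
    "s*": {p: True, not_p: True},          # impossible world: p and not-p both true
}
model = ({"s", "s*"}, sigma, {("s", "s*"), ("s*", "s*")}, {"s*"})

print(holds(model, "s", ("and", ("box", p), ("box", not_p))))   # LO7 is refuted at s
print(holds(model, "s", ("box", ("or", p, not_p))))             # a tautology is not believed
```

Validity is still judged only at possible worlds such as `s`; the impossible world merely serves as an alternative the agent fails to rule out.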








9.5.2. Sieve models 


The second approach to avoiding logical omniscience of the agent is quite different.
Again a variation of a standard Kripke model is employed, but now instead of
introducing nonstandard worlds, the model is endowed with a function $\mathcal{A}$ that acts
as a kind of sieve: it determines whether some formula is allowed to be known
(believed) (Fagin and Halpern, 1988). Intuitively, the function $\mathcal{A}$ expresses some
kind of awareness on the agent’s part: it indicates whether the agent is aware of the
formula at hand in a particular situation (world), and thus whether the formula is
amenable to be known (believed) by the agent in that world.

Formally these models have the following form (again, use $\mathcal{L}$ for the whole logical
language):


Definition 9.3 A(n epistemic) sieve model is a structure $\mathcal{M}$ of the form $(S, \pi, T,
\mathcal{A})$, where


• $S$ is a non-empty set (the set of states or possible worlds);

• $\pi : S \to (\mathcal{P} \to \{T, F\})$ is a truth assignment function to the atoms per state;

• $T \subseteq S \times S$ is the belief accessibility relation, which is assumed to be serial,
transitive and Euclidean again;

• $\mathcal{A} : S \to \wp(\mathcal{L})$ is the awareness function, assigning per state the set of formulas
that the agent is aware of; for any $s \in S$, $\mathcal{A}(s)$ is assumed to contain all
instances of the D-axiom and the introspection axioms.


Let the language contain an ‘awareness’ operator $A$ as well as the epistemic operator $\Box$.
These are interpreted on a sieve model $\mathcal{M} = (S, \pi, T, \mathcal{A})$ and a state $s \in S$ as follows:


$\mathcal{M}, s \models A\varphi$ iff $\varphi \in \mathcal{A}(s)$
and
$\mathcal{M}, s \models \Box\varphi$ iff $\varphi \in \mathcal{A}(s)$ & $\mathcal{M}, t \models \varphi$ for all $t$ such that $T(s, t)$


So from the definition, one can see how indeed the function $\mathcal{A}$ acts as a sieve: only
those formulas are considered as knowledge (belief) that it indicates the agent is
aware of. By the condition put on this function (which states something like
that the agent is aware of the D-axiom and both introspection axioms), and the fact
that the rest of the model is a standard KD45 model, it is easy to see that these
axioms are validities again.

Since the sieve model approach can only filter out formulas to be known (believed),
by this approach only the validities LO1-LO6 can be avoided. This is obvious by
taking a model that contains a possible world $s$ where the agent is not aware of the
formula to be denied, say $\psi$; namely, take $\mathcal{A}$ such that $\psi \notin \mathcal{A}(s)$. Then immediately
$\mathcal{M}, s \not\models A\psi$, and hence $\mathcal{M}, s \not\models \Box\psi$. This can be used to show that LO1-LO6 are
not valid.
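
The sieve clause can be made concrete with a small model checker. The following Python sketch is only an illustration under stated assumptions, not part of the original presentation: formulas are encoded as nested tuples, the names `holds`, `model` and `aware` are hypothetical, and the awareness sets are kept finite for simplicity (so they do not contain all instances of the D-axiom and the introspection axioms, as Definition 9.3 would require).

```python
# Formulas are nested tuples: "p" (atom), ("not", f), ("and", f, g),
# ("A", f) for awareness, ("box", f) for belief.

def holds(model, s, f):
    """Truth of formula f at state s of a sieve model (S, pi, T, aware)."""
    S, pi, T, aware = model
    if isinstance(f, str):                      # atom
        return pi[s][f]
    op = f[0]
    if op == "not":
        return not holds(model, s, f[1])
    if op == "and":
        return holds(model, s, f[1]) and holds(model, s, f[2])
    if op == "A":                               # the agent is aware of f[1] at s
        return f[1] in aware[s]
    if op == "box":                             # aware of f[1] AND true in all T-successors
        return f[1] in aware[s] and all(
            holds(model, t, f[1]) for (u, t) in T if u == s)
    raise ValueError(f"unknown operator: {op!r}")

# Hypothetical two-state model: p is true everywhere, but the agent is
# aware of p only at state "s" (finite awareness sets are an assumption).
S = {"s", "t"}
pi = {"s": {"p": True}, "t": {"p": True}}
T = {("s", "s"), ("s", "t"), ("t", "s"), ("t", "t")}   # serial, transitive, Euclidean
aware = {"s": {"p"}, "t": set()}
model = (S, pi, T, aware)

believed_at_s = holds(model, "s", ("box", "p"))
believed_at_t = holds(model, "t", ("box", "p"))
```

Under these assumptions the agent believes p at s but not at t, although p is true in every world accessible from t: the formula is filtered out by the awareness function alone.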


9.5.3. Cluster models 


Finally, here is a method with which one can avoid LO7 while still keeping the
axiom D (or, in semantical terms, keeping serial accessibility relations). This method
is strongly related to the use of what Chellas (1980) calls minimal models for so-called
non-normal modal logic, and goes back to so-called neighborhood semantics
by Scott (1970) and Montague (1974 [1968]). Chellas was mainly interested in
applying it to deontic logic, but something very similar was re-invented by Fagin
and Halpern (1988) in the context of epistemic logic and dubbed local reasoning by
means of cluster models.

Cluster models are variants of standard Kripke models in the sense that instead of
a set of epistemic alternatives, a set of sets of epistemic alternatives is incorporated in
the models. The intuition behind this is that what is normally the set of epistemic
alternatives (as viewed from an actual world) is partitioned into subsets (‘clusters’),
where these clusters correspond to coherent bodies of knowledge while two clusters
can be mutually incoherent. The typical example of such a partition of knowledge
(represented by a set of epistemic alternatives) is the theory of mechanics in physics,
which can be partitioned into classical mechanics and quantum mechanics, where
these two subtheories of mechanics are mutually inconsistent. Nevertheless, and this
is very important, it is perfectly rational for a physicist to consider both theories and
apply them when appropriate. [See also chapter 13, section 13.4.]

Formally, cluster models are defined as follows: 


Definition 9.4 A cluster model is a structure $\mathcal{M}$ of the form $\langle S, \pi, C \rangle$, where


• $S$ is a non-empty set (the set of possible worlds);

• $\pi : S \to (P \to \{T, F\})$ is a truth assignment function to the atoms per state;

• $C : S \to \wp(\wp(S))$, such that, for every $s$, $C(s)$ is a non-empty collection of
non-empty subsets (clusters) of $S$.

The belief operator may now be interpreted as follows:

$\mathcal{M}, s \models B\varphi$ iff $\exists T \in C(s)\ \forall t \in T : \mathcal{M}, t \models \varphi$


Validity is defined as usual again.

With this interpretation of belief, one may now indeed deny LO7: take a model
$\mathcal{M} = \langle S, \pi, C \rangle$, with $S = \{s, t\}$, $C(s) = \{\{s\}, \{t\}\}$, and $\pi(s)(p) = T$, $\pi(t)(p) = F$. Then
$\mathcal{M}, s \models Bp \wedge B\neg p$, thus falsifying LO7. Note that, on the other hand, it is still the case
that $\mathcal{M}, s \not\models B(p \wedge \neg p)$!
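
The counterexample can be checked mechanically. The sketch below is a minimal illustration, assuming a tuple encoding of formulas and a hypothetical `holds` function; the value chosen for C(t) is also an assumption, since the text leaves it unspecified and Definition 9.4 only requires it to be a non-empty collection of non-empty clusters.

```python
# Formulas: "p" (atom), ("not", f), ("and", f, g), ("B", f).

def holds(model, s, f):
    """Truth of f at state s of a cluster model (S, pi, C)."""
    S, pi, C = model
    if isinstance(f, str):
        return pi[s][f]
    op = f[0]
    if op == "not":
        return not holds(model, s, f[1])
    if op == "and":
        return holds(model, s, f[1]) and holds(model, s, f[2])
    if op == "B":
        # B f: some cluster in C(s) satisfies f at every one of its states
        return any(all(holds(model, t, f[1]) for t in cluster)
                   for cluster in C[s])
    raise ValueError(f"unknown operator: {op!r}")

S = {"s", "t"}
pi = {"s": {"p": True}, "t": {"p": False}}
C = {
    "s": [{"s"}, {"t"}],   # the two clusters given in the text
    "t": [{"t"}],          # assumption: C(t) is not specified in the text
}
model = (S, pi, C)

believes_p = holds(model, "s", ("B", "p"))
believes_not_p = holds(model, "s", ("B", ("not", "p")))
believes_contradiction = holds(model, "s", ("B", ("and", "p", ("not", "p"))))
```

Here `believes_p` and `believes_not_p` are both true because each is witnessed by a different cluster, while `believes_contradiction` is false because no single cluster supports both conjuncts.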

Cluster models as defined above are not yet models of epistemic logic: they do
not yet satisfy the D-axiom and the two introspection axioms. Thijsse (1992) gives
necessary and sufficient conditions for turning cluster models into “epistemic cluster
models.” Since these are rather technical (they follow from a correspondence to
neighborhood semantics in the style of Scott-Montague, and have a topological
meaning), they are mentioned here without further comments.

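Let $C^*(s)$ be defined as the set $\{X \mid T \subseteq X \text{ for some } T \in C(s)\}$. Then to cater for
the positive introspection axiom, impose the condition that


$X \in C^*(s) \Rightarrow \{t \mid X \in C^*(t)\} \in C^*(s)$

and for the negative introspection axiom impose the condition


$X \notin C^*(s) \Rightarrow \{t \mid X \notin C^*(t)\} \in C^*(s)$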


9.6. Further Refinements and Extensions 





Having looked at the “standard” treatment of knowledge by means of the systems S4
and S5, this section discusses some of the more advanced systems that have been
proposed to deal with knowledge more adequately.


9.6.1. Other systems for knowledge 


Philosophers who do not judge the system S5 to be an adequate formalization of
knowledge have asked whether it is possible to find a suitable system for capturing
the properties of knowledge that goes beyond S4, but stays “below” S5, so to speak.
Indeed, such systems between S4 and S5 have been proposed (Lenzen, 1980;
Voorbraak, 1991, 1993). For instance, Lenzen (1980) observed that if one takes
knowledge to be true belief, i.e., defining


$K'\varphi \Leftrightarrow \varphi \wedge B\varphi$


where $B$ satisfies the logic KD45, one obtains a logic for $K'$ that is known under the
name S4.4, which is the logic S4 together with the axiom


$\varphi \to (\neg K' \neg K' \varphi \to K'\varphi)$

It is not directly obvious whether this property is intuitively a suitable one for
knowledge. Other candidate logics of knowledge include that of rationally believed
objective knowledge $K''$, defined as


$K''\varphi \Leftrightarrow K\varphi \wedge B\varphi$





where $K$ is the usual S5-style type of knowledge and $B$ is a KD45-type of belief.
This operator $K''$ appears to be axiomatized by the logic S4F, which consists of the
system S4 extended with the axiom (using $M''$ as the dual of $K''$)


$(\varphi \wedge M''K''\psi) \to K''(M''\varphi \vee \psi)$


See Voorbraak (1993). Finally, there is the concept of justified knowledge ($K^j$) that is
considered by Voorbraak (1993). By giving a careful and rather ingenious semantic
analysis of this notion by means of a generalized form of Kripke models, he argues
that the logic for this type of knowledge should be the system S4.2, which consists
of S4 together with the axiom


$M^j K^j \varphi \to K^j M^j \varphi$


(using $M^j$ as the dual of $K^j$).


9.6.2. Systems for combining knowledge and belief 


After having looked at the notions of knowledge and belief separately, it is natural to
ask what the relations between these two notions are, and whether these relations
may be formalized in a logical system. Such a system might then be used in cases
where it is important to distinguish between an agent’s knowledge and beliefs, and
to reason about both these notions. A starting point of such a combined logical system
would be to take the logic S5 for knowledge ($K$) and add to it the logic KD45 for
belief ($B$). Of course, to make this a little more exciting, one should also add some
connecting axioms. Kraus and Lehmann (1986) have done so by adding the axioms³


$K\varphi \to B\varphi$
$B\varphi \to KB\varphi$


The former expresses that knowledge is stronger than belief, whereas the latter
expresses that if one believes something then one knows that one believes this (a
kind of generalized form of introspection). In themselves, these two axioms are rather
intuitive and seem innocuous. However, as Kraus and Lehmann (1986) themselves
observe, they would have also liked to include another intuitive axiom, $B\varphi \to BK\varphi$,
stating that one believes to know what one believes, but this would cause the
notions of knowledge and belief to collapse, since then $K\varphi \leftrightarrow B\varphi$ would become
derivable! This indicates that something is wrong with the intuitions. Voorbraak
(1993) blames it on having the axiom $K\varphi \to B\varphi$, in line with his views on S5 being
a weak form of objective knowledge that cannot be stronger than rational belief (as
represented by the logic KD45). Van der Hoek (1993) offers a different solution to
the problem: he sacrifices the negative introspection axiom for knowledge, thus
essentially adopting an S4-type of knowledge, which allows him to add the above-mentioned
formula $B\varphi \to BK\varphi$ as an axiom as well as the two KB-connecting axioms
of Kraus and Lehmann above. (In fact, Van der Hoek shows that by thus dropping
the negative introspection axiom for knowledge, some room is created for an
unproblematic (simultaneous) addition of some more axioms like $\neg B\varphi \to K\neg B\varphi$ and
$\neg K\varphi \to B\neg K\varphi$, expressing a kind of cross-over negative introspection.)


9.6.3. Knowledge in a group of agents 


So far, only the notion of knowledge (and belief) of a single agent has been discussed.
When considering a group of agents, one can, of course, consider the knowledge
($K_i$) of every individual agent $i$, so that a Kripke model of the following form may be
used to describe the knowledge of the various agents:


Definition 9.5 An ($m$ agents) Kripke model is a structure $\mathcal{M}$ of the form $\langle S, \pi, R_1, \ldots, R_m \rangle$, where


• $S$ is a non-empty set (the set of possible worlds);

• $\pi : S \to (P \to \{T, F\})$ is a truth assignment function to the atoms per possible
world;

• for $1 \le i \le m$, $R_i \subseteq S \times S$ is the knowledge accessibility relation for agent $i$,
assumed to be an equivalence relation.





The validities with respect to these (multi-modal) models are simply axiomatized by
a multi-modal version of S5: for each $K_i$ one takes an S5-axiomatization.

However, it is also worthwhile examining notions of knowledge that have to do
with the group as a whole. This has been done by Halpern and Moses (1985). At
least two such notions come to mind immediately. The first is knowledge that is
shared by everyone: the facts that every agent in the group knows, denoted by $E$
(‘everybody knows’). The axiomatization of $E$-knowledge is trivial: just take as an
axiom (assuming there are $m$ agents in the group):


$E\varphi \leftrightarrow K_1\varphi \wedge \cdots \wedge K_m\varphi$

Semantically, $E$ can be associated with an accessibility relation $R_E = \bigcup_{i=1}^{m} R_i$ (intuitively
this means that all agents put their sets of epistemic alternatives together in one big
set), and define


$\mathcal{M}, s \models E\varphi$ iff $\mathcal{M}, t \models \varphi$ for every $t$ with $R_E(s, t)$


Now the above axiom becomes a validity. Intuitively, this is because, once the agents
have pooled their epistemic alternatives, the only things they can be sure of as a
group are those formulas that are true in all of these alternatives.

Moreover, $E$ satisfies the basic properties of a modal (necessity-type) operator,
namely the K-axiom and necessitation rule:


$E(\varphi \to \psi) \to (E\varphi \to E\psi)$
$\varphi \,/\, E\varphi$


It also satisfies the T-axiom

$E\varphi \to \varphi$


$E$ does not, however, satisfy the introspection axioms. The technical reason is that
the union of reflexive, transitive, Euclidean relations is again reflexive, but not in
general transitive or Euclidean. That $E$ fails to satisfy introspection should not come
as a surprise: it might very well be that every agent of a group (consisting, say, of
agents 1 and 2) knows some fact $p$, while, for example, agent 2 does not know that
1 knows $p$. This situation is described by the formula


$K_1 p \wedge K_2 p \wedge \neg K_2 K_1 p$


As one might verify easily, this formula implies $Ep \wedge \neg EEp$.
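
The failure of introspection for E can be verified on a small model. The following sketch is an illustration only: the three-state model, the partitions chosen for the two agents, and the helper names `equivalence` and `box` are all assumptions made for this example. Propositions are represented by their truth sets.

```python
def equivalence(blocks):
    """Equivalence relation, as a set of pairs, induced by a partition."""
    return {(s, t) for block in blocks for s in block for t in block}

def box(R, truth_set, states):
    """States at which every R-successor lies in truth_set."""
    return {s for s in states
            if all(t in truth_set for (u, t) in R if u == s)}

states = {"a", "b", "c"}
R1 = equivalence([{"a"}, {"b", "c"}])      # agent 1 (hypothetical partition)
R2 = equivalence([{"a", "b"}, {"c"}])      # agent 2 (hypothetical partition)
RE = R1 | R2                               # R_E is the union of the R_i
p = {"a", "b"}                             # p holds at a and b only

K1p, K2p = box(R1, p, states), box(R2, p, states)
Ep = box(RE, p, states)                    # everybody knows p
EEp = box(RE, Ep, states)                  # everybody knows that everybody knows p
K2K1p = box(R2, K1p, states)               # agent 2 knows that agent 1 knows p
```

At state a both agents know p, agent 2 does not know that agent 1 knows p, and accordingly Ep holds while EEp fails. The union R_E is also visibly non-transitive here: it relates a to b and b to c, but not a to c.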
The second kind of group knowledge that comes to mind is perhaps the knowledge
of some agent in the group; write $F$ for this notion, axiomatized by the axiom


$F\varphi \leftrightarrow K_1\varphi \vee \cdots \vee K_m\varphi$


However, this is not such an interesting notion. It does not even satisfy the K-axiom.
A better idea is to look at knowledge that is implicit in the group in the sense
that if everyone shares his knowledge with everyone, it becomes knowledge in the
group. Semantically, this can be obtained as follows. Consider the sets of the epistemic
alternatives regarded by the agents separately. If there is communication between
the agents, they can help each other to rule out epistemic alternatives. In fact, what
remains after such a group communication is the intersection of the sets of epistemic
alternatives. Thus, one can directly define an accessibility relation $R_G = \bigcap_{i=1}^{m} R_i$ and
associate a modal operator $G$ with it by means of


$\mathcal{M}, s \models G\varphi$ iff $\mathcal{M}, t \models \varphi$ for every $t$ with $R_G(s, t)$


Of course, one may wonder as to the properties/axiomatization of such an operator.
It is directly clear that apart from the K-axiom and necessitation it also satisfies


$K_i\varphi \to G\varphi$ (for $i = 1, \ldots, m$)


However, somewhat surprisingly, since this axiom appears to express only that
$R_G \subseteq \bigcap_{i=1}^{m} R_i$, this is already sound and complete, as shown by, for example, van der
Hoek and Meyer (1992). The rather technical details are omitted here, but the
secret is that this type of modal logic is too coarse to distinguish between models
where $R_G = \bigcap_{i=1}^{m} R_i$ and those where $R_G \subseteq \bigcap_{i=1}^{m} R_i$, so that due to this ‘deficiency’
one can still obtain completeness.
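
A small computation may help to see how intersecting the accessibility relations yields knowledge that no single agent has. The sketch below is illustrative only; the model, the partitions, and the helper names (`equivalence`, `box`, `subsets`) are assumptions for this example. It also checks, by brute force over all truth sets of the model, that whatever an individual agent knows is known by the group.

```python
from itertools import combinations

def equivalence(blocks):
    """Equivalence relation, as a set of pairs, induced by a partition."""
    return {(s, t) for block in blocks for s in block for t in block}

def box(R, truth_set, states):
    """States at which every R-successor lies in truth_set."""
    return {s for s in states
            if all(t in truth_set for (u, t) in R if u == s)}

def subsets(states):
    """All subsets of a finite set of states."""
    items = sorted(states)
    return [set(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

states = {"a", "b", "c"}
R1 = equivalence([{"a", "b"}, {"c"}])      # agent 1 cannot tell a from b
R2 = equivalence([{"a", "c"}, {"b"}])      # agent 2 cannot tell a from c
RG = R1 & R2                               # R_G is the intersection of the R_i
p = {"a"}                                  # p holds at a only

K1p, K2p = box(R1, p, states), box(R2, p, states)
Gp = box(RG, p, states)

# K_i X implies G X for every truth set X of this model
ki_implies_g = all(box(R, X, states) <= box(RG, X, states)
                   for R in (R1, R2) for X in subsets(states))
```

In this model neither agent knows p anywhere, yet Gp holds at a: agent 1 rules out c, agent 2 rules out b, and only a survives in the intersection.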

Although the above notion of group knowledge seems intuitively clear at first
sight, it is not completely evident what it amounts to exactly. This is also reflected
somewhat in the history of the naming of the operator: the operator was first called
‘implicit’ knowledge by, for example, Halpern and Moses (1985), then renamed
‘distributed’ knowledge by Halpern and Moses (1992). However, as is shown
in van der Hoek et al. (1999 [1995]), both the properties of ‘implicitness’ and of
‘distributedness’ are debatable for the notion of group knowledge as defined above.
In particular, it is shown that, without further restrictions on the models, it may
happen that group knowledge is really stronger than what can be derived from the
agents’ individual knowledge when pooled together by means of communication,
which is rather counterintuitive!

Another very interesting notion that has been introduced and studied in the
literature is that of common knowledge. Something is common knowledge within a
group of agents if not only everybody in the group knows it but also the fact that it
is known by everyone is known by everyone, and the same for this fact, ad infinitum.
Thus, intuitively one would define common knowledge of $\varphi$, denoted $C\varphi$, as
$E\varphi \wedge EE\varphi \wedge \cdots$. However, infinite formulas are not part of our logical language.

Formally, given an $m$ agent Kripke model $\mathcal{M} = \langle S, \pi, R_1, \ldots, R_m \rangle$, the accessibility
relation $R_C$ associated with the modal operator $C$ is given as the (reflexive) transitive
closure of the relation $R_E$: $R_C = R_E^*$. This means that $R_C(s, t)$ iff there is a sequence
$s = s_0, s_1, \ldots, s_n = t$ such that $R_E(s_i, s_{i+1})$ for all $0 \le i \le n - 1$. In other words, the
relation $R_C$ connects all those possible worlds that are in 0 or more steps accessible
via the relation $R_E$, or, put differently, via some relation $R_i$, where at each step a
different $R_i$ may be chosen.
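
The closure construction can be sketched in a few lines of Python. This is a hedged illustration rather than a definitive implementation: the four-state model and the helper names (`equivalence`, `box`, `reflexive_transitive_closure`) are assumptions introduced for the example, and the closure is computed by naive iteration, which is adequate only for small finite models.

```python
def equivalence(blocks):
    """Equivalence relation, as a set of pairs, induced by a partition."""
    return {(s, t) for block in blocks for s in block for t in block}

def box(R, truth_set, states):
    """States at which every R-successor lies in truth_set."""
    return {s for s in states
            if all(t in truth_set for (u, t) in R if u == s)}

def reflexive_transitive_closure(R, states):
    """Smallest reflexive and transitive relation on states containing R."""
    closure = set(R) | {(s, s) for s in states}
    while True:
        new_pairs = {(s, v) for (s, t) in closure
                     for (u, v) in closure if t == u}
        if new_pairs <= closure:
            return closure
        closure |= new_pairs

states = {"a", "b", "c", "d"}
R1 = equivalence([{"a", "b"}, {"c"}, {"d"}])   # agent 1 (hypothetical)
R2 = equivalence([{"a"}, {"b", "c"}, {"d"}])   # agent 2 (hypothetical)
RE = R1 | R2
RC = reflexive_transitive_closure(RE, states)

p = {"a", "b", "c"}
q = {"a", "b"}
Cp = box(RC, p, states)    # p is common knowledge on the component {a, b, c}
Eq = box(RE, q, states)    # everybody knows q at a ...
Cq = box(RC, q, states)    # ... but q is common knowledge nowhere
```

The example separates the two notions: at a everybody knows q, but c is reachable from a in two R_E-steps and q fails there, so q is not common knowledge at a.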

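If the relations $R_i$ are assumed to be equivalence relations (S5), this definition
amounts to the following validities, which are taken as axioms for the modality $C$:


($K_C$) $C(\varphi \to \psi) \to (C\varphi \to C\psi)$
($T_C$) $C\varphi \to \varphi$
($K_G$) $G(\varphi \to \psi) \to (G\varphi \to G\psi)$
($T_G$) $G\varphi \to \varphi$
($4_G$) $G\varphi \to GG\varphi$
($5_G$) $\neg G\varphi \to G\neg G\varphi$
(KE) $E\varphi \leftrightarrow (K_1\varphi \wedge \cdots \wedge K_m\varphi)$
(EC) $C\varphi \to EC\varphi$
(C-ind) $C(\varphi \to E\varphi) \to (\varphi \to C\varphi)$

and to complete the system one takes rules modus ponens (MP) and

($N_C$) $\varphi \,/\, C\varphi$
($N_G$) $\varphi \,/\, G\varphi$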





Together with a multi-agent version of the system S5 (with the axioms and rules
for each operator $K_i$), the resulting system can again be proven to be sound and
complete, which, due to the rather complex notion of $C$, is not exactly obtained sine
cura (Meyer and van der Hoek, 1995). Note that the modality $C$ satisfies the same
basic S5-like axioms and rules. Furthermore, note the axiom C-ind, which, as its name
suggests, is a kind of induction axiom to capture the infinite behaviour of the $C$-modality
in a finite axiom! In semantical terms, it really is about induction along the
$R_C$ relation. It states that if anywhere along a chain of $R_E$-related worlds it holds that
if $\varphi$ holds somewhere, it also holds one $R_E$-related world further, then if $\varphi$ holds at
the beginning of such a chain then it holds also at every world along the chain (which
is exactly the same as saying that in the initial world it is common knowledge that $\varphi$).


9.7. Conclusion 


This chapter has taken a peek into epistemic (and doxastic) logic, the logic of
knowledge and belief. More accurately, it has looked at epistemic logic as a special
branch of modal logic. This has led to consideration of possible world models as a
suitable semantics for epistemic logic, and the modal systems S4 and S5 for knowledge,
and K(D)45 for belief. As seen, this sometimes gave too idealized properties
of knowledge and belief, giving rise to the problem(s) of logical omniscience. This,
in turn, gave rise to approaches in the literature where the possible world semantics
was modified (or ‘polluted’ if one prefers this term) to cope with that problem. The
more properties one wants to avoid, the more one has to deviate from (pollute)
Kripke-style possible world semantics with non-standard elements. Finally, some
more sophisticated notions and issues have been discussed, such as other systems for
knowledge that have been proposed in the literature, systems in which knowledge
and belief can be reasoned with at the same time, and epistemic notions that are
related to a group of agents.


Suggested further reading 


First of all, Lenzen (1980) is a classic comprehensive textbook on epistemic logic in German;
it is written from a philosophical perspective and also covers the notion of probability. A
number of issues touched on in this chapter are elaborated much more extensively by Meyer
and van der Hoek (1995). For example, much more attention is paid to the formal aspects of
the logics of knowledge and belief, such as the issue of completeness, while the logical
omniscience problem and various ways of dealing with it are also treated in more depth. Here
too one may find material on the relation of knowledge (and epistemic logic more in particular)
with defeasible (or ‘nonmonotonic’) reasoning in AI. Some of the more technical material
on that will also appear in a compact form in the author’s forthcoming chapter (Meyer,
2001), of the new edition of the Handbook of Philosophical Logic. Fagin et al. (1995) is an
influential book, which emphasizes employing epistemic logic for reasoning about dynamic
(computer-based) systems. Many fundamental results are presented on how knowledge may
evolve (e.g., be obtained) within computer networks where the communication links are not
completely secure, in the sense that information may be lost or mutilated in the communication
process. The successful series of proceedings of the TARK (Theoretical Aspects of
Reasoning about Knowledge, and later Theoretical Aspects of Rationality and Knowledge)
and LOFT (LOgic and the Foundations of game and decision Theory) conferences on the
multi-disciplinary use of epistemic logic (especially in computer science and economic theory)
are also worth mentioning; see, for example, Bacharach et al. (1997), Gilboa (1998), and
Halpern (1986). Finally, Laux and Wansing (1995) offers a recent collection of papers on
modern topics in epistemic logic.


Acknowledgments


The author wishes to thank Rogier van Eijk and Wiebe van der Hoek for discussions on the
topics treated in this chapter. Rogier is also greatly thanked for his help with the figure that
appears in it.


Notes 


1 Voorbraak (1991, 1993) generalizes the argument to defend S5 as the logic of distributed
systems to refer to S5 as the logic of objective knowledge, a weak kind of knowledge that
may be ascribed to artificial systems, like computer-based systems or even a thermometer.
In this case, the so-called introspective axioms have little to do with true introspection by
an agent, but rather are a way of expressing that nested forms of knowledge (like $KK$, or
$K\neg K$) can always be eliminated by reducing them to non-nested forms of knowledge (like $K$
and $\neg K$, respectively).

2 Admittedly, these conditions lack the elegance and beauty of those for standard Kripke
models, in order to deal with impossible worlds where truth is defined rather
by means of the function @. However, it is rather natural to still demand the validity of the
introspection axioms, and therefore these conditions are added for the sake of empl

3 Actually, Kraus and Lehmann (1986) propose a much richer system involving notions like
common knowledge, which we will encounter later on. Here we consider the part of the
system involving only the modal operators $K$ and $B$.


Chapter 10


Temporal Logic 
Yde Venema 


10.1. Introduction 


Time must be the most paradoxical concept our minds have to deal with. To quote
from the Confessions of St. Augustine:


What then, is time? If no one asks me, I know; but if I wish to explain it to someone
who should ask me, I do not know.


One of time’s most puzzling aspects concerns its ontological status: on the one
hand, it is a subjective and relative notion, based on our conscious experience of
successive events; yet, on the other hand, our civilization and technology are based
on the understanding that something like objective, absolute Time exists. Some
philosophers have taken this paradox so far as to conclude that time is unreal;
others, accepting the existence of absolute time, have engaged in heated debates
regarding its structure, be it linear or circular, bounded or unbounded, dense or
discrete.

Even leaving such metaphysical issues aside, time obviously plays such a fundamental
role in our thinking that there is a clear need for precise reasoning about it,
such as is seen in physics, formal linguistics, computer science, and artificial intelligence
(AI). While these enterprises are not necessarily concerned with the same
concept of time, they all could go under the heading of temporal logic. Often,
however, a more restricted, technical definition is used in which temporal logic (or
tense logic) is a branch of modal logic, an approach that began about forty years
ago with the work of Arthur Prior. This chapter is largely confined to this modal
perspective, though, as shall be seen, this still includes a great variety of systems.

Section 10.2 discusses some of the most well-known mathematical modelings of
time. These are the structures the formal languages of temporal logic are designed
to talk about. The main part of this chapter, section 10.3, is devoted to a fairly
detailed exposition of Prior’s basic tense logic; the aim of this is not only to present
this particular system, but perhaps even more to introduce the kinds of questions
that temporal logicians tend to ask. Sections 10.4 and 10.5 describe some extensions
and alternatives to this base system. Section 10.6 sketches some developments that
have taken place over the last ten years or so. Finally, the Epilogue attempts to
answer the question: what is temporal logic?


10.2. Flows of Time 


Before starting a discussion of various logics of time, it helps to look at some
standard mathematical models of time. When asked to think of time in an abstract
way, many people will form a picture of a line, only the simplest of the many spatial
metaphors that people use for temporal concepts! The mathematics of this picture is
given by a set of time points, together with an ordering relation and perhaps a
metric measuring the distance between two points. Section 10.5 discusses some
objections and alternatives to this point-based paradigm. For now, consider formally
representing time as a frame; that is, a structure $\mathcal{T} = (T, <)$ such that $<$ is a binary
relation on $T$, called the precedence relation. Elements of $T$ are called time points; if
a pair $(s, t)$ belongs to $<$, $s$ is said to be earlier than $t$. The remainder of this section
discusses a number of more or less intuitive conditions that have been imposed on
such structures to make them useful as models of time. (This section frequently uses
first- and second-order logic for describing these properties; the first-order frame
language used here has only one dyadic predicate symbol, which is denoted by $R$
and interpreted as $<$.)

Obviously, many frames will not qualify as intuitively acceptable representations of
time. At a minimum, one should require that $<$ be irreflexive and transitive. Call a
frame satisfying these conditions a flow of time. Flows of time are known from
mathematics as strict partial orders, and, in accordance with this, familiar notation
like $s > t$ will be used for ‘$s$ is later than $t$’ and $s \le t$ for ‘either $s = t$ or $s < t$’. For a
point $t$, the set $\{s \in T \mid t < s\}$ will be called the future of $t$; the past of $t$ is defined
likewise. (In the sequel, definitions pertaining to the past are omitted if they mirror
an obvious counterpart for the future.)

Standard candidates are given by the familiar orderings of well-known number sets:
$\mathbb{N} = (\mathbb{N}, <)$ (the natural numbers), $\mathbb{Z} = (\mathbb{Z}, <)$ (the integer numbers), $\mathbb{Q} = (\mathbb{Q}, <)$
(the rational numbers) and $\mathbb{R} = (\mathbb{R}, <)$ (the real numbers). Less familiar examples are
the binary tree $\mathbb{B} = (B, \lhd)$ (where $B$ is the set of sequences of 0s and 1s, while $s \lhd t$
holds if $s$ is a proper initial segment of $t$), and four-dimensional Minkowski spacetime
$\mathbb{S} = (\mathbb{R}^4, \prec)$; here, $(x_1, x_2, x_3, t) \prec (x_1', x_2', x_3', t')$ if not only is the temporal component
of the first point smaller than that of the second one ($t < t'$), but the
spatial distance between the two points also enables one to reach the one point
from the other without having to travel faster than the speed of light.

Observe that this definition excludes circular time: if there were a series of time
points $s_1 < s_2 < \cdots < s_n < s_1$ then by transitivity $s_1 < s_1$, which is not possible since the
flow of time is assumed to be irreflexive. Since it is not the logician’s task to choose
between different ontologies, why not allow circular time? After all, many civilizations
have regarded time as being essentially cyclic in nature. Also, practical applications of
circular time are easily conceivable, such as the construction of rotas. The only
reason is simply that circular time has received very little attention in the logical
literature.

On the other hand, the reader may have missed one condition in the definition of
a flow of time, namely linearity. A strict partial ordering is called linear if any two
distinct points are related; expressed in first-order logic, the structure is to satisfy the
sentence $\forall xy (Rxy \vee x = y \vee Ryx)$. This perspective on time is dominant in science
and, probably for that reason, has become the standard in most people’s minds; in
particular, all of the given number examples are linear flows of time. Nevertheless,
so-called branching-time structures such as $\mathbb{B}$ and $\mathbb{S}$ have received a lot of attention
in the literature on temporal logic. A structure is called branching to the future if
there is some point having two unrelated points in its future, and not branching to
the future if, on the contrary, the future of each point is linearly ordered. A flow of
time is not branching if it is neither branching to the future nor branching to the
past; note that this condition differs from linearity in that it does not exclude
‘parallel’ time lines. In the literature, one often encounters the condition that flows
of time are allowed to branch to the future, but not to the past; this condition
reflects the idea that at any moment, the past is determined while the future is not.
As shall be seen later, the logic of branching time ties up with the logic of necessity
and possibility, i.e., with alethic modal logic [see chapter 7]. The sequel of this
section is confined to linear time, but this is not to say that the concepts to be
defined would not make sense outside of this context.
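
On finite frames, the first-order conditions discussed so far can be tested by brute force. The following Python sketch is offered only as an illustration: the function names are hypothetical, and the two frames are finite fragments (strings of length at most 2 from the binary tree, and an initial segment of the natural numbers) standing in for the infinite structures of the text.

```python
from itertools import product

def is_irreflexive(points, lt):
    return all(not lt(x, x) for x in points)

def is_transitive(points, lt):
    return all(lt(x, z) for x, y, z in product(points, repeat=3)
               if lt(x, y) and lt(y, z))

def is_linear(points, lt):
    # forall x y (Rxy or x = y or Ryx)
    return all(lt(x, y) or x == y or lt(y, x)
               for x, y in product(points, repeat=2))

def unrelated(lt, y, z):
    return y != z and not lt(y, z) and not lt(z, y)

def branches_to_future(points, lt):
    # some point has two unrelated points in its future
    return any(lt(x, y) and lt(x, z) and unrelated(lt, y, z)
               for x, y, z in product(points, repeat=3))

def branches_to_past(points, lt):
    # some point has two unrelated points in its past
    return any(lt(y, x) and lt(z, x) and unrelated(lt, y, z)
               for x, y, z in product(points, repeat=3))

# Finite fragment of the binary tree: 0/1-strings of length at most 2,
# ordered by "is a proper initial segment of".
tree = ["".join(bits) for n in range(3) for bits in product("01", repeat=n)]

def prefix(s, t):
    return s != t and t.startswith(s)

# Finite fragment of the natural numbers with the usual ordering.
chain = list(range(5))

def less(s, t):
    return s < t
```

With these definitions, both fragments are flows of time (irreflexive and transitive); the chain is linear, whereas the tree fragment is not linear, branches to the future, and does not branch to the past.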

Questions concerning the boundedness of time have occupied philosophers, theologians
and physicists for centuries, but for the logician this is generally not the most
interesting issue. It suffices just to mention the definitions pertaining to the future:
$\mathcal{T}$ has a final point if it satisfies $\exists y \forall x (Rxy \vee x = y)$, while it is called right-serial if each
point has a non-empty future.

A more fundamental choice, it seems, is that between denseness and discreteness of
a flow of time. A linear ordering $\mathcal{T}$ is dense if between any two distinct points there
is a third point; formally: $\forall xy (Rxy \to \exists z (Rxz \wedge Rzy))$. Because this way of representing
time is very convenient for modeling the notion of movement, dense flows of
time such as the orderings of the rational or the real numbers are common. However,
for computer scientists and economists, time has a very different flavor in that
it is supposed to proceed in discrete steps: with each non-final point is associated a
next point or immediate successor, i.e., $\forall xy (Rxy \to \exists z (Rxz \wedge \neg\exists u (Rxu \wedge Ruz)))$.
Standard examples of discrete flows of time are given by the natural or the integer
numbers.

Density is often confused with continuity. Suppose that the set of rational numbers
is cut into a left and a right half, of numbers smaller and bigger than $\sqrt{2}$, respectively.
Such a cut, without a proper point on either edge, is called a gap, and a flow
of time is called continuous if it has no gaps. $\mathbb{Q}$ thus forms the standard counterexample,
whereas $\mathbb{R}$ and $\mathbb{Z}$ are continuous.

Unlike the properties discussed before, continuity is essentially a second-order
notion, its definition necessarily involving a quantification over sets of time points.
There are many other interesting second-order conditions that one may impose on
temporal structures. For example, one might argue that abstract time exists independently
of the events “filling it,” and that therefore, the structure of time should
be ‘the same everywhere.’ One way of making this precise is to demand that a flow
of time is homogeneous: for any two points $s$ and $t$ of $T$, there should be an automorphism
of $\mathcal{T}$ (that is, a bijection $f$ from $T$ onto $T$ satisfying $x < y$ iff (if and only if)
$f(x) < f(y)$ for all $x$ and $y$ in $T$) mapping $s$ to $t$. Another second-order property that
is met further on is that of having finite intervals; this means that there can be at
most finitely many points between any two points. Observe that this condition
implies discreteness of both $<$ and $>$.


10.3. Basic Temporal Logic 


This section shows temporal logic at work. That is, it presents Prior’s basic system of
temporal logic, and discusses some of the fundamental logical questions pertaining
to it.


10.3.1. Syntax and semantics


To define the syntax and semantics of temporal logic, one should first note that
temporal logic is an extension of classical propositional logic. Recall that classically,
propositional formulas are interpreted as truth values (either 1 for ‘true’ or 0 for
‘false’); this truth value is inductively determined by a valuation: a function mapping
propositional variables to truth values. Once the valuation is known, the truth value
of any formula is fixed. Now what to do with the fact that the truth value of
statements like


It is raining,
or
I am carrying an open umbrella,


will change from time to time? For instance, it may be raining today but sunny
tomorrow; or, I may be carrying my umbrella up now but fold it some time after the
rain stops.

The first basic idea underlying temporal logic is to make valuations time-dependent;
more precisely, one associates a separate valuation with each point of a
given flow of time. Formally, let $\mathcal{T} = (T, <)$ be a flow of time; a valuation on $\mathcal{T}$ is a
map $\pi : T \to (\Phi \to \{0, 1\})$. ($\Phi$ denotes the set of propositional variables.) A model
is a pair $\mathcal{M} = (\mathcal{T}, \pi)$ consisting of a flow of time and a valuation.

Observe that, with this definition, one can already interpret classical formulas in
each point of a model in a standard way. For instance, the formula $p \wedge \neg q$ is said to
be true at a time point $t$ precisely if $\pi(t)(p) = 1$ and $\pi(t)(q) = 0$. The spice of
temporal logic, however, lies in its second basic idea, namely to use new, non-classical
connectives to relate the truth of formulas in possibly distinct time points.
This section discusses two such operators: $F$ and $P$. These names are mnemonics for
‘future’ and ‘past’ respectively: the intended meaning of the formula ‘$F\varphi$’ is ‘at some
time in the future, $\varphi$ is the case,’ while ‘$P\varphi$’ is to be read as ‘at some time in the past,
$\varphi$ holds.’

Formally, define the set $\mathcal{L}_t$ of Priorean formulas as the smallest set containing the
propositional variables that is closed under constructing new formulas using the
Boolean connectives $\neg$ and $\wedge$, and the temporal operators $G$ and $H$. For technical
reasons, take $F$ and $P$ to be defined operators in this set-up; $F\varphi$ abbreviates $\neg G \neg\varphi$
and $P\varphi$ abbreviates $\neg H \neg\varphi$. $G\varphi$ and $H\varphi$ are read as ‘henceforth, $\varphi$’ and ‘hitherto, $\varphi$,’
respectively. As further abbreviations, use $\bot$, $\top$, $\vee$, $\to$ and $\leftrightarrow$ in their usual meaning.
Note also that the mirror image of a formula is simply the formula one obtains by
simultaneously replacing all $H$s with $G$s and vice versa.

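Bringing the previous observations together gives the following inductive definition
of the notion of truth of a formula $\varphi$ at a time point $t$ in a model $\mathcal{M} = (T, <, \pi)$:


$\mathcal{M}, t \Vdash p$ if $\pi(t)(p) = 1$
$\mathcal{M}, t \Vdash \neg\varphi$ if not $\mathcal{M}, t \Vdash \varphi$
$\mathcal{M}, t \Vdash \varphi \wedge \psi$ if $\mathcal{M}, t \Vdash \varphi$ and $\mathcal{M}, t \Vdash \psi$
$\mathcal{M}, t \Vdash G\varphi$ if $\mathcal{M}, s \Vdash \varphi$ for all $s$ with $t < s$
$\mathcal{M}, t \Vdash H\varphi$ if $\mathcal{M}, s \Vdash \varphi$ for all $s$ with $t > s$ (10.1)


If $\mathcal{M}, t \Vdash \varphi$ then $\varphi$ is said to hold or be true at $t$.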

As an example, consider the ordering N of the natural numbers; let π be the
valuation making q true at all numbers bigger than 1000, and r at all even numbers.
With this valuation, it is easy to see that the formula FGq holds at the point 0.
For, the formula Gq holds at those points of which the future is a subset of the
set of ‘q-points,’ and this is the case for any number bigger than 999. But from
M, 1000 ⊩ Gq and 0 < 1000 it follows that M, 0 ⊩ FGq. It is likewise easy to see
that the formula FGr does not hold at 0, or indeed, at any point in this model;
the formula GFr on the other hand holds throughout.
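
The clauses of (10.1) can be tried out mechanically on a finite flow of time. The following is a minimal sketch, not part of the original text: the tuple encoding of formulas, the function names and the six-point model are all assumptions made for illustration.

```python
# Formulas are nested tuples: ("var", "q"), ("not", f), ("and", f, g), ("G", f), ("H", f).

def holds(points, less, val, t, f):
    """Truth of formula f at point t, clause by clause as in (10.1)."""
    op = f[0]
    if op == "var":
        return f[1] in val[t]
    if op == "not":
        return not holds(points, less, val, t, f[1])
    if op == "and":
        return holds(points, less, val, t, f[1]) and holds(points, less, val, t, f[2])
    if op == "G":  # all strictly later points
        return all(holds(points, less, val, s, f[1]) for s in points if less(t, s))
    if op == "H":  # all strictly earlier points
        return all(holds(points, less, val, s, f[1]) for s in points if less(s, t))
    raise ValueError(f"unknown operator: {op}")

def F(f):  # F abbreviates not-G-not
    return ("not", ("G", ("not", f)))

def P(f):  # P abbreviates not-H-not
    return ("not", ("H", ("not", f)))

# Hypothetical finite model: points 0..5, q true from 4 onwards, r true at even points.
points = list(range(6))
less = lambda a, b: a < b
val = {t: ({"q"} if t >= 4 else set()) | ({"r"} if t % 2 == 0 else set()) for t in points}
q, r = ("var", "q"), ("var", "r")

print(holds(points, less, val, 0, F(("G", q))))  # FGq at 0
print(holds(points, less, val, 2, ("G", q)))     # Gq at 2
print(holds(points, less, val, 5, P(r)))         # Pr at 5
```

A finite chain is only a rough stand-in for N: its last point has no future, so every formula Gφ holds there vacuously, and FGr therefore holds at 0 in this finite model although it fails everywhere on N.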

Finally, observe that from the technical point of view, this system is very similar to
systems of alethic modal logic [see chapter 7] since G and H are very much like the
necessity operator L. The difference is that in Priorean temporal logic there are two
modal operators instead of one. One might then expect that one would interpret
this language in structures with two accessibility relations, say, R_G and R_H. And, in
fact, it is possible to adopt a perspective in which one sees < and > as these two
distinct accessibility relations; however, it is a crucial aspect of temporal logic that
these two accessibility relations are each other's converse. The main distinction
between alethic modal logic and temporal logic is thus one of aim: temporal logic
starts with structures (flows of time), for which one is trying to find good modal
description languages; whereas, in alethic modal logic, it has more often been the
other way around.


10.3.2. Validity and definability


Temporal logicians are generally not so much interested in the truth or falsity of
formulas in specific models, but rather in those formulas that remain true throughout
the flow of time even if the valuation is changed. It is felt that such formulas
provide essential information concerning the structure of the underlying flow of
time. Formally, a formula φ is said to be valid on a flow of time T (notation: T ⊩ φ)
if for every valuation π on T, and every point t of T, (T, π), t ⊩ φ. A formula is
valid in a class of flows of time if it is valid on each member of the class. The
notion of satisfiability is defined dually: a formula φ is said to be satisfiable in
a flow of time (a class of flows of time) if its negation is not valid on the flow of
time (in the class of flows of time, respectively).

As an example, it can be shown that the formula Fq → FFq is valid on the class of
dense linear orderings. Assume that T is a dense linear flow of time; to show that
Fq → FFq holds on it, consider an arbitrary valuation π on T, and an arbitrary point
t in T such that (T, π), t ⊩ Fq. By the truth definition, there is a later point s
where q holds. But by density, there must be some point u between t and s; from
u < s it follows that Fq holds at u; but then t < u implies that FFq holds at t;
since t and π were arbitrary, this suffices to show that T ⊩ Fq → FFq.

On the other hand, it is easy to see that the formula Fq → FFq is not valid on the
ordering of the integers. For, take the points 0 and 1 and consider the valuation
that makes q true only at 1; then obviously, Fq is true at 0; but since there is no
integer between 0 and 1, the formula FFq cannot be true at 0. This shows
that indeed Z ⊮ Fq → FFq. It is possible, in fact, to generalize this argument to show
that the formula Fq → FFq can be falsified on every non-dense frame. For, any
non-dense frame must contain two points s < t without intermediate points; so the
valuation making q true only at t will make the formula Fq → FFq false at s. Hence,
the formula Fq → FFq is very informative; it is a reliable witness of the density of
a flow of time.
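
On a finite frame, validity can be decided by brute force, simply by running through all valuations. The sketch below is illustrative only; the encoding, the helper names and the three-point frames are assumptions, not taken from the text. It falsifies Fq → FFq on a three-point chain (which is not dense) and, for contrast, checks Gq → GGq, a formula that is valid on every transitive frame.

```python
from itertools import product

def holds(points, less, val, t, f):
    op = f[0]
    if op == "var":
        return f[1] in val[t]
    if op == "not":
        return not holds(points, less, val, t, f[1])
    if op == "and":
        return holds(points, less, val, t, f[1]) and holds(points, less, val, t, f[2])
    if op == "G":
        return all(holds(points, less, val, s, f[1]) for s in points if less(t, s))
    if op == "H":
        return all(holds(points, less, val, s, f[1]) for s in points if less(s, t))
    raise ValueError(op)

def F(f):
    return ("not", ("G", ("not", f)))

def implies(f, g):
    return ("not", ("and", f, ("not", g)))

def valid(points, less, f, variables=("q",)):
    """True iff f holds at every point under every valuation of `variables`."""
    pts = list(points)
    for masks in product(range(2 ** len(pts)), repeat=len(variables)):
        # Bit i of a variable's mask says whether that variable is true at pts[i].
        val = {t: {v for v, m in zip(variables, masks) if (m >> i) & 1}
               for i, t in enumerate(pts)}
        if not all(holds(pts, less, val, t, f) for t in pts):
            return False
    return True

q = ("var", "q")
density_axiom = implies(F(q), F(F(q)))                   # Fq -> FFq
transitivity_axiom = implies(("G", q), ("G", ("G", q)))  # Gq -> GGq

chain = range(3)                      # 0 < 1 < 2, a non-dense linear order
strict = lambda a, b: a < b
successor = lambda a, b: b == a + 1   # not transitive, hence not a flow of time

print(valid(chain, strict, density_axiom))
print(valid(chain, strict, transitivity_axiom))
print(valid(chain, successor, transitivity_axiom))
```

The search is exponential in the number of points, so this is only practical for very small frames; it is meant to make the definition of validity concrete, not to serve as a decision procedure.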

In general, a Priorean formula φ is said to define a class C of flows of time within
a class K if for every flow of time T in K, T ⊩ φ iff T belongs to C. If C is given as
the class of frames satisfying some first-order property α, φ is also said to correspond
to α (within K). For instance, as has just been seen, the formula Fq → FFq
corresponds to density.

Not every property of flows of time is definable; for instance, one can prove that
there is no Priorean formula that defines the class of branching flows of time. On the
other hand, there is a formula defining the flows of time that are not branching; for,
the formula PFq → (Pq ∨ q ∨ Fq) corresponds to non-branchingness to the future.
Hence, the conjunction of this formula and its mirror image defines the flows of
time that are not branching.

Especially if flows of time are confined to linear orderings, many interesting
properties can be defined in the Priorean language. In this list, a number of such
correspondences holding for linear flows of time are given, together with the names of
the modal formulas. Here, ◇φ abbreviates Pφ ∨ φ ∨ Fφ, and □φ abbreviates Hφ ∧ φ ∧ Gφ.

   Having a first point      H⊥ ∨ PH⊥                                    (A1)
   Left-seriality            P⊤                                          (A2)
   Having a final point      G⊥ ∨ FG⊥                                    (A3)
   Right-seriality           F⊤                                          (A4)
   Discreteness              (F⊤ ∧ q ∧ Hq) → FHq                         (A5)
   Density                   Fq → FFq                                    (A6)
   Continuity                (Fq ∧ ◇¬q ∧ □(q → Hq)) →
                             ◇((q ∧ G¬q) ∨ (¬q ∧ Hq))                    (A7)
   Having finite intervals   (G(Gq → q) → (FGq → Gq)) ∧
                             (H(Hq → q) → (PHq → Hq))                    (A8)
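
The first four correspondences involve no propositional variables, so on a finite frame they can be checked without looking at valuations at all. The following sketch does this for a four-point chain; the encoding and all names are assumptions made for the example.

```python
def holds(points, less, t, f):
    """Truth at t of a variable-free formula built from bot, not, and, G, H."""
    op = f[0]
    if op == "bot":
        return False
    if op == "not":
        return not holds(points, less, t, f[1])
    if op == "and":
        return holds(points, less, t, f[1]) and holds(points, less, t, f[2])
    if op == "G":
        return all(holds(points, less, s, f[1]) for s in points if less(t, s))
    if op == "H":
        return all(holds(points, less, s, f[1]) for s in points if less(s, t))
    raise ValueError(op)

BOT = ("bot",)
TOP = ("not", BOT)

def F(f):
    return ("not", ("G", ("not", f)))

def P(f):
    return ("not", ("H", ("not", f)))

def lor(f, g):
    return ("not", ("and", ("not", f), ("not", g)))

A1 = lor(("H", BOT), P(("H", BOT)))  # having a first point
A2 = P(TOP)                          # left-seriality
A3 = lor(("G", BOT), F(("G", BOT)))  # having a final point
A4 = F(TOP)                          # right-seriality

def valid(points, less, f):
    # Without variables, validity is just truth at every point.
    return all(holds(points, less, t, f) for t in points)

chain = range(4)                     # hypothetical frame 0 < 1 < 2 < 3
lt = lambda a, b: a < b
results = {name: valid(chain, lt, f)
           for name, f in [("A1", A1), ("A2", A2), ("A3", A3), ("A4", A4)]}
print(results)
```

As the table predicts, a finite chain validates A1 and A3 (it has a first and a final point) and refutes A2 and A4 (it is serial in neither direction).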


Finally, since Priorean formulas may be interpreted on all frames (also ones that
are not strict partial orders), the question naturally arises whether the class of flows
of time itself is definable. Since, analogous to the case of ordinary modal logic,
transitivity may be defined by the formula Gp → GGp, this reduces to the problem
of finding a correspondent for irreflexivity (within the class of transitive frames).
Unfortunately, there is no such formula.


10.3.3. Axiomatics 


As mentioned already, temporal logic starts with flows of time; but obviously, this
does not diminish the interest in finding complete calculi for various classes of flows
of time. Obviously, there are close connections with the axiomatics of alethic modal
logic [see chapter 7]. In particular, analogous to K, there is a minimal temporal logic
for the Priorean language as well; it is called K_t and defined as the smallest class of
Priorean formulas that contains the following axioms and is closed under the following
derivation rules:

   (CT)  all classical propositional tautologies
   (DB)  G(q → r) → (Gq → Gr)
         H(q → r) → (Hq → Hr)                     (Distribution)
   (CV)  q → GPq
         q → HFq                                  (Converse)
   (4)   Gq → GGq                                 (Transitivity)

   (US)  if φ is a theorem, then so is φ[ψ/q]     (Uniform substitution)
   (MP)  if φ and φ → ψ are theorems, then
         so is ψ                                  (Modus ponens)
   (TG)  if φ is a theorem, then so are Gφ
         and Hφ                                   (Temporal generalization)










Yde Venema 


Here φ[ψ/q] denotes the result of substituting the formula ψ for the propositional
variable q, uniformly throughout φ.
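
Uniform substitution is a purely syntactic operation and is easy to state as a recursive function. The sketch below assumes formulas are encoded as nested tuples with an operator tag in first position; this encoding, the tag "imp" for implication and the function name are illustrative choices, not notation from the text.

```python
def substitute(f, var, psi):
    """Return f[psi/var]: replace every occurrence of the variable `var` in f by psi."""
    if f[0] == "var":
        return psi if f[1] == var else f
    # Keep the operator tag and substitute in every immediate subformula.
    return (f[0],) + tuple(substitute(g, var, psi) for g in f[1:])

q, r = ("var", "q"), ("var", "r")
axiom_4 = ("imp", ("G", q), ("G", ("G", q)))   # Gq -> GGq
instance = substitute(axiom_4, "q", ("H", r))  # GHr -> GGHr
print(instance)
```

Because every occurrence of q is replaced at once, the instance GHr → GGHr is again a theorem by rule (US).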

Most of these axioms and all of these rules are, perhaps under different names,
familiar from ordinary modal logic. The exception is the Converse axiom (CV); as
will be seen, this axiom is needed to ensure that the accessibility relations for the
operators G and H are each other's converse. The formula (4) reflects the transitivity
of the intended accessibility relation of a modal operator [see chapter 7]; thus, our
constraints on flows of time explain its presence as an axiom. Recall from the previous
subsection that the property of being irreflexive is not definable in the Priorean
language; now notice that irreflexivity does not even yield any extra validities. (This
is not the rule in modal logics: frame conditions that are not definable in the modal
language may nevertheless imply the validity of modal formulas.)


Theorem 10.1  The logic K_t is sound and complete with respect to the class of
all flows of time.


For lack of space, the proof of Theorem 10.1 is omitted. Instead, this section
concentrates on completeness for the class of linear flows of time. Let Lin be the
extension of K_t with the axiom (NB), which is the conjunction of the axiom
PFq → (Pq ∨ q ∨ Fq) (defining non-branching to the future) and its mirror image
FPq → (Fq ∨ q ∨ Pq).


Theorem 10.2  The logic Lin is sound and complete with respect to the class of
linear flows of time.


Proof. This proof method makes use of the familiar canonical models [see chapter 7].
Let W^c be the set of maximal Lin-consistent sets of formulas, and define the relation
R^c on W^c by R^c wv iff φ ∈ v for all Gφ ∈ w. The structure F^c = (W^c, R^c) is called
the canonical frame; on it, define the canonical valuation π^c so that π^c(w)(q) = 1 iff
q ∈ w.

The first aim is to prove a Truth Lemma for this model, stating that for all
Priorean formulas φ and every point w of the canonical model M^c = (F^c, π^c), ‘truth
coincides with membership’:

   M^c, w ⊩ φ   iff   φ ∈ w                                            (10.2)


As usual, (10.2) is proved by formula induction. There is only one minor problem,
caused by the fact that now there are two modal operators, and only the one
accessibility relation. This is precisely where the Converse axioms come in: they
make it possible to show that the canonical accessibility relation not only works well
for G but also for H. For, it can be proved (details are left to the reader) that R^c wv
iff φ ∈ w for all Hφ ∈ v.

Now it follows easily from (10.2) that every Lin-consistent set of formulas is
satisfiable in the canonical model, but unlike the case of modal logics like S4 it does
not finish here. It is important to satisfy the Lin-consistent set of formulas in a linear
flow of time. Now it is easy to verify that the canonical accessibility relation is
transitive (use the axiom (4), as in modal completeness proofs); it is not very difficult
to show that R^c is not branching (the details of this proof are left to the reader; use
the axiom (NB)); but it is impossible to prove that R^c is a linear ordering, because
in general this will not be true! The main problem is that nothing guarantees
irreflexivity of the canonical accessibility relation. The difficult part of the proof
consists in showing that it is possible to transform the canonical frame into a strict
linear order, while truth of formulas is preserved.

A frame F = (W, R) is called a pseudo-line if R is transitive and strongly connected,
i.e., satisfying ∀xy(Rxy ∨ x = y ∨ Ryx). Now given any maximal Lin-consistent set Σ,
it is possible to restrict consideration to the part of the canonical frame that is
connected (via R^c) to Σ and still prove the analogue of the Truth Lemma (10.2). It
thus follows that every consistent formula is satisfiable in a pseudo-line. But then
the missing link in the proof of the completeness theorem for Lin is the following
claim.

   If φ is satisfiable on a pseudo-line, then also on a linear flow of time.      (10.3)


To prove claims like (10.3), several methods of ‘frame surgery’ have been developed;
to give the reader an idea of such techniques, a brief sketch of the bulldozing
method is given here. Assume that φ is satisfiable in the model M = (F, π) based on
the pseudo-line F = (W, R). The first observation is that F may be represented as a
linear ordering < of so-called clusters, which are special subsets of W. Each point s of
W belongs to a unique cluster C_s which is either degenerate (consisting of a single
irreflexive point) or proper (if R is universal on it). The relation < is defined such
that C_s < C_t iff C_s ≠ C_t and Rst.

The key idea is now to ‘bulldoze’ each proper cluster C into a special linear ordering
L_C and to replace each C with L_C. Obviously, replacing each proper cluster with
a linearly ordered model yields a linear order; but is φ still satisfiable in the new
model? To understand the positive answer to this question, note that any proper
cluster introduces an infinity of information recurrence in both the forward and
backward directions: one can follow paths within C, moving either forwards or
backwards along R, for as long as one pleases. Thus, when a cluster C is replaced
with a linear ordering, it is important to ensure that the linear ordering duplicates
all the information in C infinitely often, and in both directions. Bulldozing does
precisely this, in the most straightforward way possible. For instance, suppose that
the cluster C has three elements only: s_0, s_1 and s_2, with associated classical
valuations σ_0, σ_1 and σ_2. Then L_C is given as the model (Z, π_C); here π_C is given
by π_C(z) = σ_(z mod 3); that is, L_C consists of an unbounded (in both directions)
series of points with associated classical valuations σ_0, σ_1 and σ_2, as in
... σ_0 σ_1 σ_2 σ_0 σ_1 σ_2 ...

There is thus an obvious relation linking points in the new, transformed model to
points in the old one; using this one may prove that φ is indeed satisfiable in the new
model. This finishes the proof sketch of (10.3). QED
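
For a finite proper cluster, the bulldozed ordering and the map back to the old points can be written down directly. The sketch below covers only this finite case, in the spirit of the three-element example above; the function names, the point labels and the particular valuations are assumptions made for illustration.

```python
def bulldoze(cluster, sigma):
    """Return (pi_C, origin) for the bulldozed copy of a finite proper cluster.

    cluster: list of the cluster's points, e.g. ["s0", "s1", "s2"]
    sigma:   dict mapping each cluster point to its set of true variables
    """
    n = len(cluster)

    def origin(z):
        # The old point that the integer z stands for; Python's % is
        # non-negative for a positive modulus, so negative z works too.
        return cluster[z % n]

    def pi_C(z):
        # Valuation of the new point z, copied from the old point it stands for.
        return sigma[origin(z)]

    return pi_C, origin

cluster = ["s0", "s1", "s2"]
sigma = {"s0": {"q"}, "s1": set(), "s2": {"r"}}   # hypothetical valuations
pi_C, origin = bulldoze(cluster, sigma)

window = [origin(z) for z in range(-4, 5)]
print(window)
```

The function `origin` plays the role of the relation linking new points to old ones: every old point recurs within any n consecutive integers, both before and after any given z, which is the recurrence property the argument needs.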

Turning to the axiomatics of specific structures, let us define the following logics:

   Lin.N:  Lin + A1 + A4 + A8
   Lin.Z:  Lin + A2 + A4 + A8
   Lin.Q:  Lin + A2 + A4 + A6
   Lin.R:  Lin + A2 + A4 + A6 + A7

For these logics, the following result applies.


Theorem 10.3  The logics Lin.N, Lin.Z, Lin.Q and Lin.R are sound and
complete axiomatizations of the set of validities of the flows of time N, Z, Q and
R, respectively.


One may conclude that temporal logicians have been rather successful in
axiomatizing the standard flows of time and the most natural classes of flows of
time. Nevertheless, it would be wrong to conclude that conversely, (axiomatically
defined) tense logics are always characterized by a class of flows of time. As in modal
logic, incompleteness is the rule; in fact, the very first example of an incomplete
(poly-)modal logic was found in tense logic.


10.3.4. Decidability and complexity 


The completeness theorems mentioned in the previous subsection are important, of
course, but if one wants to do actual reasoning in one of these logics, further
properties are required. Minimally, one wants the logic to be decidable; i.e., the
existence is required of a terminating algorithm separating the logic's theorems from
its non-theorems. Fortunately, all the complete logics defined in the previous
subsection have this property. More explicitly:


Theorem 10.4  The Priorean tense logics of the classes of all flows of time, and
of all linear flows of time, are decidable.


This follows from the fact that these logics are finitely axiomatizable and have the
finite model property. The latter may be proved through the method of filtrations or
the method of mini-canonical models [see chapter 7], with allowance for complexities
analogous to the proof of completeness for Lin.

For practical purposes, decidability is not enough, however; one would like to
have an efficient calculus. A more fine-grained analysis is needed to reveal the
computational complexity of temporal logics. There is not enough space to go into
details here; suffice it to mention the result that the satisfiability problem for
linear time is in NP. To be more precise, one can devise a non-deterministic Turing
machine algorithm that correctly tells whether a Priorean formula φ is satisfiable in
a linear frame or not, while arriving at this answer within f(|φ|) computation steps.
Here |φ| is the length of the formula φ, and f is a polynomial function.


10.4. Extending the Language


The discussion in section 10.3 was based on the basic temporal language having G
and H as its only primitive operators. For many applications, however, this language
is too poor in expressivity, and several extensions with new operators have been
suggested. This section examines some of the most important of these, especially the
terms ‘since’ and ‘until,’ and also operators for branching time structures.


10.4.1. Since and until


Hans Kamp introduced two dyadic operators, S and U, with the intended meaning
of ‘since’ and ‘until’, as in the sentences

   Ever since the roof caved in, it's been wet in the house.

and

   Until we get the roof fixed, it will be damp in the house.

Let L_su denote the extension of the Priorean language with these two new connectives,
the formal truth definition of which is given as follows.


   M, t ⊩ Uφψ   if  M, s ⊩ φ for some s such that t < s
                    and M, u ⊩ ψ for all u with t < u < s

   M, t ⊩ Sφψ   if  M, s ⊩ φ for some s such that s < t
                    and M, u ⊩ ψ for all u with s < u < t


It is interesting to observe that the ‘old’ operators can be expressed in this new
language; for instance, Fφ may be seen to abbreviate Uφ⊤. But conversely, the new
operators really add expressive power to the language; it can be proved that they
cannot be defined in terms of the old.

Another interesting temporal operator is the so-called nexttime or tomorrow operator
X; the formula Xφ holds at a time point t if φ holds at the next moment in time (if
there is such a next moment). Obviously, such an operator only makes sense in a
discrete flow of time, as, for instance, in computer science, where one wants to talk
about the next state of a process. However, adding this new connective to L_su would
not add any expressive power, since Xφ can already be defined as an abbreviation for
Uφ⊥.
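
The truth clauses for U and S, and the two abbreviations Fφ = Uφ⊤ and Xφ = Uφ⊥, can be tested on a small discrete linear order. This is a minimal sketch under assumed conventions: points are integers, formulas are nested tuples, and the five-point model is hypothetical.

```python
def holds(points, val, t, f):
    """Truth at t over a finite linear order of numbers, with Until and Since."""
    op = f[0]
    if op == "top":
        return True
    if op == "bot":
        return False
    if op == "var":
        return f[1] in val[t]
    if op == "not":
        return not holds(points, val, t, f[1])
    if op == "and":
        return holds(points, val, t, f[1]) and holds(points, val, t, f[2])
    if op == "U":  # U phi psi: phi at some later s, psi strictly between t and s
        return any(holds(points, val, s, f[1])
                   and all(holds(points, val, u, f[2]) for u in points if t < u < s)
                   for s in points if t < s)
    if op == "S":  # S phi psi: phi at some earlier s, psi strictly between s and t
        return any(holds(points, val, s, f[1])
                   and all(holds(points, val, u, f[2]) for u in points if s < u < t)
                   for s in points if s < t)
    raise ValueError(op)

TOP, BOT = ("top",), ("bot",)

def F(f):  # F phi as U phi top
    return ("U", f, TOP)

def X(f):  # X phi as U phi bot: nothing may lie strictly in between
    return ("U", f, BOT)

points = list(range(5))
val = {0: set(), 1: {"r"}, 2: {"r"}, 3: {"q"}, 4: set()}
q, r = ("var", "q"), ("var", "r")

print(holds(points, val, 0, ("U", q, r)))  # r holds at 1 and 2, then q at 3
print(holds(points, val, 2, X(q)))         # 3 is the next point after 2
print(holds(points, val, 1, X(q)))         # point 2 lies in between
```

Since ⊥ holds nowhere, Uφ⊥ can only be witnessed by a point s with no points strictly between t and s, that is, by an immediate successor.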

This raises the question whether perhaps every temporal operator can be defined
in this apparently expressive language L_su. The answer to this question is positive;
it is possible to prove some sort of functional completeness result for L_su:


Theorem 10.5 (Kamp)  Over the class of linear, continuous orderings, every
temporal operator can be defined in L_su.





Note: By ‘temporal operator’ is meant any operator whose truth definition is
expressible in first-order logic. The restriction in the theorem to certain flows of
time is essential. In particular, once the condition of linearity is dropped, the results
tend to be negative; for instance, over the class of all flows of time it is not possible
to find a finite expressively complete set of operators.

Finally, for the language L_su one can ask the same kind of questions as for L_t; and
indeed, several results have been proved concerning definability, axiomatizability
and decidability. In general, these results are positive, but there is not enough space
to give details here.


10.4.2. Branching time languages


As mentioned above, allowing flows of time that branch to the future means that
one can no longer assume that the past determines everything that is going to
happen. But if the formalism has to take into account that there are many different
courses of events possible, it seems appropriate to pay somewhat more attention to
the truth definition of the future operator F. For, the intuitive meaning of Fφ,
namely ‘it will be the case that φ,’ is now more ambiguous than in linear flows
of time. Recall that the interpretation of Fφ that can be calculated from the truth
definition (10.1) yields ‘φ holds at some future moment of some possible course of
events.’ But it does not seem to be unreasonable to assume that ‘it will be the case
that φ’ expresses the speaker's conviction that φ will be the case, in the actual course
of events, or perhaps no matter what course of events. These two interpretations
give rise to respectively the Ockhamist and Peircean schools in branching time logic.
To compare these two approaches, assume our flows of time to be trees, i.e.,
connected strict partial orders that do not branch to the past. (Connectedness
forbids, for instance, parallel time lines.) A branch of a tree T = (T, <) is a maximal
linearly ordered subset of T; the intuitive idea is that each branch through t
represents a possible course of events (for a point t and a branch b, t is said to lie
on b, or b is said to go through t, if t belongs to b). In this way, one can imagine a
possible future of t as the set of all later points on some fixed branch b through t;
since T is a tree, each point will have a unique past.

Now Peircean branching time logic interprets the proposition ‘it will be the case
that φ’ in the second way indicated above, namely that φ is bound to happen in
every possible future. To make this more precise, define the Peircean tense language
as the extension of the Priorean one with the future operator F_∀; this operator has a
second-order definition, involving a quantification over all branches through the
actual time point:

   M, t ⊩ F_∀φ   if  for every branch b passing through t
                     there is some s > t on b with M, s ⊩ φ              (10.4)

In the Ockhamist approach on the other hand, it is meaningless to ask about
the truth value of formulas of the form Fφ or Gφ at a time point t, unless one has
specified which of the possible futures of t one has in mind. To be able to express
that something will be the case no matter what form the future will take,
Ockhamists extend the language with an alethic modal operator □. Ockhamist
temporal logic is thus an interesting combination of modal and temporal logic;
perhaps the easiest way to work out the idea formally is to require that in Ockhamist
semantics the truth value of any formula is evaluated at a pair consisting of a time
point and a branch through this point (representing the actual course of events).
This leads to the following truth definition:


   M, t, b ⊩ q        if  π(t)(q) = 1
   M, t, b ⊩ ¬φ       if  not M, t, b ⊩ φ
   M, t, b ⊩ φ ∧ ψ    if  M, t, b ⊩ φ and M, t, b ⊩ ψ
   M, t, b ⊩ Gφ       if  M, s, b ⊩ φ for all s on b with t < s
   M, t, b ⊩ Hφ       if  M, s, b ⊩ φ for all s on b with t > s
   M, t, b ⊩ □φ       if  M, t, c ⊩ φ for all branches c through t      (10.5)


It is interesting to note that the Peircean language can be seen as a fragment of
the Ockhamist one; consider the inductively defined translation (·)° mapping Peircean
formulas to Ockhamist ones. The only non-trivial clauses of this map concern the
future operators:

   (F_∀φ)° = □Fφ°   and   (Gφ)° = □Gφ°

It is straightforward to prove that, for all tree models M, all points t in M and all
branches b through t:

   M, t ⊩ φ   iff   M, t, b ⊩ φ°
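
The agreement between Peircean truth and Ockhamist truth of the translation can be checked exhaustively on a small finite tree. The sketch below is an illustration under stated assumptions: the four-point tree, the tags "Fall" (for the Peircean future operator) and "box", and all function names are invented for the example, and the past operator H is left out for brevity.

```python
from itertools import product

# Hypothetical finite tree: 0 is the root, 1 and 2 are its children, 3 is a child of 1.
parent = {0: None, 1: 0, 2: 0, 3: 1}

def ancestors(t):
    out = []
    while parent[t] is not None:
        t = parent[t]
        out.append(t)
    return out

def less(s, t):
    return s in ancestors(t)

leaves = [t for t in parent if t not in parent.values()]
# In a finite tree the branches (maximal chains) are the root-to-leaf paths.
branches = [frozenset([leaf] + ancestors(leaf)) for leaf in leaves]

def peirce(val, t, f):
    op = f[0]
    if op == "var":
        return f[1] in val[t]
    if op == "not":
        return not peirce(val, t, f[1])
    if op == "and":
        return peirce(val, t, f[1]) and peirce(val, t, f[2])
    if op == "G":     # Priorean G: every later point of the tree
        return all(peirce(val, s, f[1]) for s in parent if less(t, s))
    if op == "Fall":  # clause (10.4): on every branch through t, some later point
        return all(any(peirce(val, s, f[1]) for s in b if less(t, s))
                   for b in branches if t in b)
    raise ValueError(op)

def ockham(val, t, b, f):
    op = f[0]
    if op == "var":
        return f[1] in val[t]
    if op == "not":
        return not ockham(val, t, b, f[1])
    if op == "and":
        return ockham(val, t, b, f[1]) and ockham(val, t, b, f[2])
    if op == "G":     # later points on the current branch only
        return all(ockham(val, s, b, f[1]) for s in b if less(t, s))
    if op == "box":   # all branches through t
        return all(ockham(val, t, c, f[1]) for c in branches if t in c)
    raise ValueError(op)

def translate(f):
    op = f[0]
    if op == "var":
        return f
    if op == "not":
        return ("not", translate(f[1]))
    if op == "and":
        return ("and", translate(f[1]), translate(f[2]))
    if op == "G":
        return ("box", ("G", translate(f[1])))
    if op == "Fall":  # box F, with F written as not-G-not
        return ("box", ("not", ("G", ("not", translate(f[1])))))
    raise ValueError(op)

q = ("var", "q")
formulas = [("Fall", q), ("G", q), ("Fall", ("G", q)), ("not", ("Fall", ("not", q)))]
agree = all(
    peirce(val, t, f) == ockham(val, t, b, translate(f))
    for bits in product([False, True], repeat=len(parent))
    for val in [{t: ({"q"} if bit else set()) for t, bit in zip(parent, bits)}]
    for f in formulas for t in parent for b in branches if t in b
)
print(agree)
```

With q true only at point 3, the plain Priorean Fq holds at the root while the Peircean operator fails there, because the branch through 2 never reaches a q-point; this is the difference between the two readings of ‘it will be the case that’.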


Many results are known concerning Peircean and Ockhamist logic; for instance,
axiomatizations have been found for the Peircean logic of the class of all trees. This
logic is also known to be decidable, as is its Ockhamist alternative. It is an open
problem to find an explicit axiomatization for the Ockhamist tree logic.

Finally, it is obvious that one can extend these branching time logics even further,
for instance with the Since and Until operators defined earlier. The ‘future
fragment’ of such systems is closely related to so-called computational tree logics that
have been developed within theoretical computer science for the purpose of reasoning
about paths through labeled transition systems, which in their turn form perhaps
the simplest mathematical models of the notion of computation. It is interesting to
note that the Peircean and the Ockhamist approaches in philosophical logic find
(much more technically inspired) counterparts in the development of the
computational tree logics: CTL and CTL*, respectively.


10.5. Time Periods


So far time has been represented by a point-based paradigm. Nevertheless, it seems
that, in every field where temporal logics are used or studied, at a certain moment
systems are designed in which periods, rather than points, are the central entities, or
at least, play a more prominent role.


10.5.1. Motivations 


The point-based perspective has never been without philosophical objections. For
instance, Zeno's paradox of the flying arrow, which, it is argued, cannot change
position at an isolated moment of time and thus cannot move at all, makes it clear
that there is something problematic about representing time as a series of durationless
moments if one wants to describe the concept of movement. Some temporal
predicates seem simply not to apply to time points. Suppose that p is a proposition
formalizing the statement that Zeno's arrow moves. Obviously, the flying of an
arrow is an activity that is extended in time; hence, one might argue that it is
pointless to evaluate the truth of p at moments of time. It thus seems that, at least,
one needs the existence of time periods for the evaluation of certain expressions.

Apart from such semantic considerations, it is clear that time points are not the kind
of objects that one can directly perceive. Due to years of exposure to the scientific view
on time it may not always be possible to realize this, but if one wants to base reality
on direct experience, then time points will come out as highly abstract and complex
artifacts. Thus, it has been argued, it is a dubious enterprise to take points as having
primitive ontological status; periods form a far sounder base. This second argument
has been taken up, with a more practical twist, within AI. Here the idea has been
advocated that period-based representations of time are simpler and more natural in
formalizing common sense reasoning than the standard scientific models. (Obviously,
this argument may be pushed farther, questioning the Newtonian perspective in
which absolute Time exists regardless of anything happening in it. Such objections may
lead to event-based ontologies which due to lack of space cannot be discussed here.)

Finally, in our discussion until now it has been assumed that there is a clear and
intuitive distinction between points and periods. This is questionable as well,
however; one can quite convincingly argue that there is a notion of granularity involved
here. A good example can be taken from computer science, where the multiplication
of two numbers may be taken as an atomic, durationless action of a high-level
programming language, whereas it may be implemented in terms of many operations
on the lower level of the machine language.





10.5.2. Time in periods 


It is important to observe that the need for a more prominent role of periods does
not necessarily commit one to model time in structures in which periods are
primitive entities; they might as well be derived objects.

Indeed, one could well start from a flow of time T = (T, <) as described earlier,
and then consider the question how to represent chunks of time within such a
structure. For instance, periods could be defined as convex sets: subsets C of T that
are uninterrupted in the sense that whenever s and a later point t belong to C, then
so does any point between s and t. A set-theoretically slightly simpler option is to
only consider (closed) intervals; in this approach, the period {u ∈ T | s ≤ u ≤ t} can
simply be represented as the pair [s, t]. Observe that this approach has the
advantage that properties of periods can be expressed by binary predicates in the
first-order frame language, whereas for convex sets some kind of higher-order logic has
to be used [see chapter 2].

If one opts for periods as primitive entities, the simplest mathematical modeling
will involve structures consisting of a set P of periods equipped with a collection of
natural relations on P. But, in contrast to the point-based approach where the
temporal precedence relation is the candidate for such a relation, there are now many
options. For instance, since one is obviously still interested in temporal precedence,
the relation <, with p < q holding if the entire period p precedes the entire period
q, is a natural candidate, but so is the inclusion relation ⊏, with p ⊏ q holding if p
is a proper part of q. And in fact, one widespread period-based modeling of time
is that in structures of the form P = (P, <, ⊏). But < and ⊏ are not the only
candidates. If one is interested in relations that are close to common sense
experience, then the relation of one period overlapping with another is quite relevant as
well. And this need not be confined to binary relations at all: unary predicates may
be needed to determine whether a period is of zero duration (and hence, point-like),
whereas there are also interesting ternary relations such as the relation C holding of
a triple p, q, r if p can be ‘chopped’ into the two pieces q and r. Of course, just like
in the point-based case, one needs to impose conditions on period structures to
make them useful as models of time. For instance, in a structure of the kind
P = (P, <, ⊏) one will want < and ⊏ to be strict partial orders that are related by
conditions like ∀xyz(x ⊏ y < z → x < z) and others.

The reader may have realized how hard it is to gather one's intuitions and make
a complete list of such conditions without taking resort to talking about points
after all. The concept of a point in time has obviously been very useful in our
thinking about time. Hence, even if periods are to be taken as the primitive entities
of one's ontology, it is at least interesting, if not a test for the viability of the
proposal, to see whether one can construct point-based flows of time from period
structures. Various ways have been worked out for this purpose. Perhaps the
simplest method is to take as points those periods that have zero duration; of course,
this only works if such entities are around and there is access to this information
(for instance, through a zero-duration predicate as mentioned above). But even if
the period structure does not have atomic periods, there are ways to extract a point
structure from it, for instance, by defining a point to be any maximal set of mutually
overlapping periods. Finally, once there are ways to construct point structures
from period structures and vice versa, the obvious question is to see how such
constructions interact. This line of research has been taken up with great
mathematical sophistication, in a number of cases even leading to interesting categorical
dualities.


10.5.3. Interval-based temporal logic


Just as in the case for point-based temporal logics, one may choose a class of period
structures, design a formal language to talk about it, and study the resulting temporal
logic. For instance, imagine working with intervals in point-based flows of time as
described above. Taking the modal approach, one is presented with a multidimensional
setting; i.e., one wants to evaluate formulas at pairs of points representing the
beginning and the end point of the interval, respectively. Typical modal operators
are ⟨D⟩ and ∘ with rules of truth given by

   M, [s, t] ⊩ ⟨D⟩φ    if  M, [u, v] ⊩ φ for some u, v with s ≤ u ≤ v ≤ t
   M, [s, t] ⊩ φ ∘ ψ   if  M, [s, u] ⊩ φ and M, [u, t] ⊩ ψ for some u with s ≤ u ≤ t


In words, ⟨D⟩φ holds at an interval if φ holds at some interval during it, while φ ∘ ψ
holds at an interval if it can be chopped into a φ-part and a ψ-part. In period terms,
one would say that ⊏ and C are the accessibility relations of ⟨D⟩ and ∘, respectively.
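
The two truth rules translate directly into a small evaluator over the intervals of a finite linear order. The sketch below assumes integer points, an interval-based valuation given as sets of pairs, and the tags "D" and "chop"; none of these conventions come from the text.

```python
def holds(val, s, t, f):
    """Truth of f at the interval [s, t] over integer points (requires s <= t)."""
    op = f[0]
    if op == "var":
        return (s, t) in val[f[1]]
    if op == "not":
        return not holds(val, s, t, f[1])
    if op == "and":
        return holds(val, s, t, f[1]) and holds(val, s, t, f[2])
    if op == "D":     # some interval [u, v] with s <= u <= v <= t
        return any(holds(val, u, v, f[1])
                   for u in range(s, t + 1) for v in range(u, t + 1))
    if op == "chop":  # split [s, t] at some u into [s, u] and [u, t]
        return any(holds(val, s, u, f[1]) and holds(val, u, t, f[2])
                   for u in range(s, t + 1))
    raise ValueError(op)

# Hypothetical interval valuation: p holds exactly at [0, 2], q exactly at [2, 5].
val = {"p": {(0, 2)}, "q": {(2, 5)}}
p, q = ("var", "p"), ("var", "q")

print(holds(val, 0, 5, ("chop", p, q)))  # chop point u = 2
print(holds(val, 0, 5, ("D", p)))        # [0, 2] lies within [0, 5]
print(holds(val, 1, 5, ("D", p)))        # [0, 2] is not within [1, 5]
```

Each formula is evaluated at a pair of points rather than at a single point, which is the two-dimensional character of the semantics mentioned above.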

For such modal systems, one may investigate meta-logical properties like
completeness and decidability. The general picture here is that one has a price to pay for
the increase in expressivity: complete axiomatizations are scarce and hard to find,
and undecidability is the rule rather than the exception. On a technical level, the
modal logic of time periods thus seems to be more complex (and hence, more
intriguing) than point logics over the same flows of time, but the kinds of questions
that are asked do not differ much.

Hence, to finish, this section mentions some issues that are of specific interest to
period logics. To start with, period logics differ from point logics in the sense that,
in many cases, it is natural to correlate the interpretation of atomic propositions.
A condition that one often encounters is that of homogeneity, requiring that an
atomic proposition holds at a period iff it holds at each of its parts. It is obvious that
such a condition only has intuitive appeal for the propositions corresponding to the
event categories of states and activities. And even in the latter case, one may raise
objections to the ‘only if’ part of this condition: I can truthfully say that I have been
walking through town for hours when in fact, I have paused a couple of times to
take a coffee.

Now suppose that this condition is being implemented on some interval structure
$I(T)$ induced by the flow of time $T$ by demanding that for each propositional
variable $p$ and each point-based valuation $V$ we have

\[
(I(T), V), [s,t] \Vdash p \quad\text{iff}\quad (T, V), u \Vdash p \text{ for all } u \text{ with } s \le u \le t \tag{10.6}
\]


Observe that thus we have effectively reduced period predicates to point
predicates. Such a reduction would have considerable computational advantages,
something that can easily be explained by taking a first-order perspective. It is obvious
that the particular proposal (10.6) is rather naive: Zeno’s moving arrow will lead
one into trouble. But perhaps there are more inventive modelings in which formulas
can be evaluated at periods, while the valuations remain point based?

In any case, regardless of the technical advantages of reducing period predicates to
point predicates, it is clear that there is a rather general philosophical issue at stake
here, namely the problem of which kinds of predicates apply to periods and points,
respectively, and how these are correlated. This issue is in fact a matter of ongoing,
and at times heated, debate.


10.6. Temporal Logic Now


As mentioned before, temporal logic has become a vast and active research area with
applications in many disciplines. This section briefly sketches some of these recent 
developments. Since not all of the work mentioned here is covered by the mono- 
graphs mentioned in the Suggested Further Reading, references are provided to the 
literature. 


10.6.1. Richer ontological structures


One common trend in temporal logic is to study logics of richer ontological
structures, since it is obvious that for serious real-world applications the kind of temporal
logics that have been described so far are too simple. For example, one shortcoming
of standard temporal logics is that they only deal with qualitative timing properties,
whence they are inadequate for applications such as reasoning about real-time behavior
of software. To overcome this deficiency, people have designed logics for describing
two-sorted structures consisting of a linear flow of time connected with some metric
domain. Such approaches can be found both in the point-based (Montanari and de
Rijke, 1997) and in the period-based (Hansen and Chaochen, 1997) paradigm.
Another example of a multiple-sorted ontology has already been met in the semantics
of Ockhamist branching time logic, where branches appeared as a second kind of
entities, next to points. One might vary on this ‘standard’ Ockhamist logic by
admitting only some instead of all branches, perhaps a collection satisfying some
additional constraints (Zanardo, 1996). Applying this idea of using multiple-sorted
temporal ontologies to the discussion of the previous section, one can envisage
structures in which points, periods and events co-exist, linked by suitable relations
(Gardent et al., 1994). One possibility for such a link involves the notion of
granularity: atomic objects might suddenly turn out to be divisible when approached at a
different level. This obviously ties up with the way of classifying periods of time
(months, weeks, days); modal logics for such layered structures are described by
Montanari (1996).





10.6.2. Temporal logic at work 


Turning temporal logics into actual working systems has created a number of
interesting problems and challenges. For instance, one of the most fundamental
contributions that AI has made to the field of temporal logic is that of identifying
the frame problem. This is the problem of formalizing the properties of an
application area that are unaffected by the performance of some action without explicitly
summing up all such properties. This problem appears to be independent of the
particular formalism employed, and has to be faced by anyone wishing to give a
formal account of reasoning about change (Sandewall and Shoham, 1995). The
computer science literature on modal logics of time has yielded an interesting
perspective on the modal truth relation ($\mathfrak{M} \Vdash \varphi$) between a model $\mathfrak{M}$, which is
supposed to be finite, and a formula $\varphi$; in this perspective $\varphi$ represents some
property of a program and $\mathfrak{M}$ some implementation of the program. For obvious reasons
then, a considerable amount of effort has been devoted to finding fast model checking
algorithms deciding whether a given formula holds in a given finite model
(Stirling, 1999). As a last example, notice the dynamic turn which research in the
semantics of natural language has taken. In this way of thinking, the meaning of a
formula does not lie so much in its truth condition; linguistic expressions are rather
like programs that update the information state of some agent. For instance, in
Discourse Representation Theory (Kamp and Reyle, 1993) [see chapter 20] temporal
expressions in natural language are used to extend and refine temporal representations
of the discourse; these representations in their turn are syntactic items themselves
that can be interpreted in standard models.
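
To make the model checking task concrete, here is a naive labeling sketch for the basic tense operators over a finite model. It is not one of the fast algorithms from the literature cited above; it simply computes, bottom-up, the set of points at which each subformula holds. The tuple encoding of formulas and the names `sat` and `G` are assumptions of this illustration.

```python
def sat(formula, points, before, valuation):
    """Return the set of points of a finite model at which `formula` holds.

    points    -- iterable of time points
    before    -- set of pairs (s, t), read as 's is earlier than t'
    valuation -- dict mapping atom names to sets of points
    formula   -- ("atom", p), ("not", f), ("and", f, g),
                 ("F", f) 'at some later point', ("P", f) 'at some earlier point'
    """
    points = set(points)
    tag = formula[0]
    if tag == "atom":
        return points & set(valuation.get(formula[1], ()))
    if tag == "not":
        return points - sat(formula[1], points, before, valuation)
    if tag == "and":
        return (sat(formula[1], points, before, valuation)
                & sat(formula[2], points, before, valuation))
    if tag == "F":
        target = sat(formula[1], points, before, valuation)
        return {s for (s, t) in before if t in target and s in points}
    if tag == "P":
        target = sat(formula[1], points, before, valuation)
        return {t for (s, t) in before if s in target and t in points}
    raise ValueError(f"unknown connective: {tag}")


def G(f):
    """'At all later points', defined as the dual of F."""
    return ("not", ("F", ("not", f)))


# A four-point linear flow of time in which p holds only at point 2.
points = range(4)
before = {(s, t) for s in points for t in points if s < t}
valuation = {"p": {2}}
p = ("atom", "p")
print(sorted(sat(("F", p), points, before, valuation)))
print(sorted(sat(G(("not", p)), points, before, valuation)))
```

Deciding whether a formula holds at a given point of the model then amounts to a membership test in the returned set.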


10.6.3. Temporal logic in context


There is an increasing tendency to study modal formalisms not as isolated systems
but in connection with other branches of logic, as in Correspondence Theory, which
relates modal logic to first- and second-order logic. For instance, the use of
game-theoretic methods has deepened our understanding of the relative expressive power
of modal logics of time: in particular, variants of Ehrenfeucht-Fraïssé games
have provided an interesting perspective on expressive completeness results such as
Theorem 10.5 (Immerman and Kozen, 1987; Venema, 1990). Recent approaches
to decidability questions concerning modal and temporal logics use insights from
algebraic logic and automata theory. This has led to the identification of a variety of
decidable fragments of first-order logic, each of which is obtained from atomic
formulas using all Boolean connectives but allowing only a specific, guarded pattern
of quantification (Andréka et al., 1998). Finally, notice the emergence of so-called
hybrid languages, which aim to boost the expressive power of modal languages by
adding some features from first-order logic like ‘names’ (special variables that are to
be true at a single state), over which quantification is allowed (Blackburn and
Tzakova, 1999; Goranko, 1994).


10.7. Epilogue


What then, is temporal logic?
In the narrowest sense, temporal logic comprises the design and study of specific
systems for representing and reasoning about time, such as Prior’s tense logic. These
enterprises may have both an applied and a theoretical side, the former consisting of
designing a system (i.e., making choices in the fields of ontology, syntax and
semantics), formalizing temporal phenomena in it, and then putting it to work (perhaps
through implementing it). On the theoretical side, one aims to prove formal properties
of the system, such as completeness or decidability.

On a slightly wider scale, temporal logicians may thus provide a supply of general
tools and techniques for answering questions pertaining to specific systems. As an
example, note the method of filtration, which is a quite general method of proving
decidability of a temporal logic, and the canonical model method, which is very
useful in proving completeness results.

A more ambitious aim for temporal logicians is to devise frameworks for
comparing and connecting different modelings of time. This aim can be realized both
at a technical and at a philosophical level. As an example of the first, think of
the game-theoretic analysis of the expressive power of modal languages, or of the
duality between point-based and period-based representations of time. On
a philosophical level, a thorough classification of event types and of the
correlation between predicates pertaining to points and to periods, respectively, would be
an extremely useful tool in any discussion on formal representations of temporal
phenomena.

Since all of this is relevant for each of the disciplines where formal reasoning
about time is needed, temporal logic forms a prime example of the growing role of
logic as a source and channel of ideas and techniques applicable in related
disciplines. Ultimately, one would hope that temporal logic can provide a unifying
perspective on our sometimes confusing thoughts about this highly puzzling thing
we call time.









Suggested further reading


This chapter has only scratched the surface of temporal logic. The following monographs,
each surveying part of the field of temporal logic, would form a good start for a bibliography.
Concerning the philosophy of time, I do not believe there is one standard reference, but
Whitrow (1980) offers a very comprehensive study of the concept of time, while Le Poidevin
and MacBeath (1993) bring together some seminal articles on the subject. Øhrstrøm and
Hasle (1995) give a good treatment of philosophical aspects of temporal logic from a
historical perspective. Goldblatt (1987) provides the reader with a concise and very accessible
treatment of the most important modal logics of time; Gabbay et al. (1994) give a more
extensive mathematical treatment. Manna and Pnueli (1991) have produced a classic on
applications of temporal logic in computer science; Gabbay et al. (1995) give a good overview
of the applications of temporal logic in AI. There seems to be no monograph on the treatment
in formal linguistics of temporal aspects of natural language, but Steedman (1997) surveys the
field well. Van Benthem (1991) offers a stimulating blend of much of the above. Finally, for
an overview of recent developments in temporal logic, the reader is referred to the proceedings
of the first two conferences devoted solely to temporal logic, ICTL'94 (Gabbay and Ohlbach,
1994) and ICTL'97 (Barringer et al., 1999).


Chapter 11
Intuitionistic Logic 
Dirk van Dalen 


11.1. Basic Principles 


There are basically two ways to view intuitionistic logic: as a philosophical-foundational
issue in mathematics, or as a technical discipline within mathematical logic.
Considering first the philosophical aspects, for they will provide the motivation for the
subject, this chapter follows L. E. J. Brouwer, the founding father of intuitionism.
Although Brouwer himself contributed little to intuitionistic logic as seen from
textbooks and papers, he did point the way for his successors.

Logic in Brouwer's intuitionism takes a secondary place; the first is reserved for
mathematics, which should be understood in the widest possible sense, as the
constructional mental activity of the individual. The role of logic is to note and
systematically study certain regularities in the mathematical constructional process. Contrary
to traditional views, logic is thus dependent on mathematics and not vice versa.

Mathematical practice shows that relatively few logical connectives suffice for an
efficient treatment of arguments. In the case of intuitionism, the meaning of the
connectives has to be explained in terms of the basic mathematical notion:
construction. A fact A is established by means of a construction. An easy example is $3 + 2 = 5$,
which is established by the following construction: construct 3, construct 2 and
compare the outcome with the result of the construction of 5. The outcome is a
confirmation of the above equation.

The construction criterion for ‘truth’ also yields an interpretation of the
connectives. Write ‘$a : A$’ for ‘$a$ is a construction that establishes $A$’; this $a$ is then called a
proof of $A$.

A proof of $A \wedge B$ is simply a pair of proofs $a$ and $b$ of $A$ and $B$. For convenience,
here is a notation for the pairing of constructions, and for the inverses (projections):
$(a, b)$ denotes the pairing of $a$ and $b$, and $(c)_0$, $(c)_1$ are the first and second
projection of $c$. The proof of a disjunction $A \vee B$ is a pair $(p, q)$ such that $p$ carries the
information of which disjunct is correct, and $q$ is the proof of it. Stipulate that
$p \in \{0, 1\}$, so $p = 0$ and $q : A$, or $p = 1$ and $q : B$. Note that this disjunction is
effective, in the sense that the disjunct is specified; this contrasts with classical logic,
where one does not have to know which disjunct holds.

Negation is also defined by means of proofs: ‘$p : \neg A$’ says that each proof $a$ of $A$
can be converted by the construction $p$ into a proof of an absurdity, say $0 = 1$. A
proof of $\neg A$ thus tells us that $A$ has no proof!

The most interesting propositional connective is implication. The classical
solution, $A \to B$ is true if $A$ is false or if $B$ is true, cannot be used because this uses
classical disjunction; moreover, it assumes that the truth values of $A$ and $B$ are
known before one can settle the status of $A \to B$. Heyting, however, showed that
this is asking too much. Consider

$A$ = ‘there occur twenty consecutive 7s in the decimal expansion of $\pi$’
and
$B$ = ‘there occur nineteen consecutive 7s in the decimal expansion of $\pi$’.

Then $\neg A \vee B$ does not hold constructively, but the implication $A \to B$ is obviously
correct.

The intuitionistic approach, based on the notion of proof, demands a definition of
a proof $a$ of the implication $A \to B$ in terms of (possible) proofs of $A$ and $B$. The
idea is quite natural: $A \to B$ is correct if one can show the correctness of $B$ as soon
as the correctness of $A$ has been established. Thus: $p : A \to B$ if $p$ transforms each
proof $q : A$ into a proof $p(q) : B$. The meaning of the quantifiers is specified along
the same lines. Assume a given domain $D$ of mathematical objects. A proof $p$ of
$\forall x A(x)$ is a construction which yields for every object $d \in D$ a proof $p(d) : A(d)$.
A proof $p$ of $\exists x A(x)$ is a pair $(p_0, p_1)$ such that $p_1 : A(p_0)$. Thus the proof of an
existential statement requires an instance plus a proof of this instance.

The full list is given in Table 11.1. (Observe that an equivalent characterization of
the disjunction can be given: $a = (a_1, a_2)$, where $a_1 = 0$ and $a_2 : A$, or $a_1 = 1$ and $a_2 : B$.)

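Table 11.1

$a : A$                  Conditions

$a : \bot$               false
$a : A \wedge B$         $a = (a_1, a_2)$, where $a_1 : A$ and $a_2 : B$
$a : A \vee B$           $a = (a_1, a_2)$, where $a_2 : A$ if $a_1 = 0$ and $a_2 : B$ if $a_1 = 1$
$a : A \to B$            for all $p$ with $p : A$, $a(p) : B$
$a : \exists x A(x)$     $a = (a_1, a_2)$ and $a_2 : A(a_1)$
$a : \forall x A(x)$     for all $d \in D$, $a(d) : A(d)$, where $D$ is a given domain
$a : \neg A$             for all $p : A$, $a(p) : \bot$


This proof interpretation is now demonstrated for a few statements:

1  $A \to (B \to A)$
   An operation $p$ is needed that turns a proof $a : A$ into a proof of $B \to A$. But if
   there is already a proof $a : A$, then there is a simple transformation that turns a
   proof $q : B$ into a proof of $A$, i.e., the constant mapping $q \mapsto a$, which is denoted
   by $\lambda q.a$. And so the construction that takes $a$ into $\lambda q.a$ is $\lambda a.(\lambda q.a)$, or in an
   abbreviated notation $\lambda a \lambda q.a$. Hence $\lambda a \lambda q.a : A \to (B \to A)$.
2  $A \to \neg\neg A$
   A proof $q$ of $\neg\neg A$ is a proof of $\neg A \to \bot$. Assume $p : A$, and $q : \neg A$. Then
   $q(p) : \bot$, so $\lambda q.q(p) : \neg\neg A$. Hence $\lambda p \lambda q.q(p) : A \to \neg\neg A$.
3  $A \vee \neg A$
   $p : A \vee \neg A \Leftrightarrow (p)_0 = 0$ and $(p)_1 : A$, or $(p)_0 = 1$ and $(p)_1 : \neg A$. However, for an
   arbitrary proposition $A$ it is not known whether $A$ or $\neg A$ has a proof, and hence
   $(p)_0$ cannot be computed. So, in general, there is no proof of $A \vee \neg A$.
4  $\neg\exists x A(x) \to \forall x \neg A(x)$
   $p : \neg\exists x A(x) \Leftrightarrow p(a) : \bot$ for a proof $a : \exists x A(x)$. One has to find $q : \forall x \neg A(x)$,
   i.e. $q(d) : A(d) \to \bot$ for any $d \in D$. So pick an element $d$ and let $r : A(d)$; then
   $(d, r) : \exists x A(x)$ and so $p((d, r)) : \bot$. Therefore put $(q(d))(r) = p((d, r))$, so
   $q = \lambda d \lambda r.p((d, r))$ and hence $\lambda p \lambda d \lambda r.p((d, r)) : \neg\exists x A(x) \to \forall x \neg A(x)$.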
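
The constructions in examples 1, 2 and 4 can be written down directly as functions. The sketch below is only an illustration under loud assumptions: proofs are modelled as ordinary Python values, strings stand in for proofs of atomic statements, and `refute` is a hypothetical stand-in for a proof of a negation. Python checks none of the typing claims; the comments record the intended reading.

```python
# Modelling assumptions of this sketch:
#   proof of A and B  -> a pair (a, b)
#   proof of A or B   -> a pair (tag, proof), tag 0 for A and tag 1 for B
#   proof of A -> B   -> a function from proofs of A to proofs of B
#   proof of not A    -> a function from proofs of A to a proof of absurdity

def pair(a, b):
    """Pairing of two constructions."""
    return (a, b)

def first(c):
    """First projection of a pair."""
    return c[0]

def second(c):
    """Second projection of a pair."""
    return c[1]

# Example 1: the constant-mapping construction for A -> (B -> A).
example_1 = lambda a: lambda q: a

# Example 2: from p : A build a proof of not-not-A by applying q : not-A to p.
example_2 = lambda p: lambda q: q(p)

# Example 4: from p : not-exists-x-A(x) build q with q(d)(r) = p((d, r)).
example_4 = lambda p: lambda d: lambda r: p(pair(d, r))

# Stand-in for a proof of a negation: it turns any proof into an 'absurdity' token.
refute = lambda proof: ("absurdity from", proof)

print(example_1("proof of A")("proof of B"))
print(example_2("proof of A")(refute))
print(example_4(refute)(7)("proof of A(7)"))
```

Example 3 has no counterpart here, and that is the point of the text: writing a uniform `lambda` for $A \vee \neg A$ would require computing the tag 0 or 1 for an arbitrary $A$.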





Brouwer himself handled logic in an informal way, often showing the untenability
of certain classical principles by a reduction to unproven statements, usually regarding
the decimal expansion of $\pi$. This technique, which goes by the name of ‘Brouwerian
counterexample’, is illustrated with the following examples.

First, compute simultaneously the decimals of $\pi$ and the members of a Cauchy
sequence. Use $N(k)$ as an abbreviation for ‘the decimals $p_{k-89}, \ldots, p_k$ of $\pi$ are all 9.’
Now define:

\[
a_n = \begin{cases}
(-2)^{-n} & \text{if } \forall k \le n\ \neg N(k) \\
(-2)^{-k} & \text{if } k \le n \text{ and } N(k)
\end{cases}
\]

(in the second case $k$ is the least number for which $N(k)$ holds). $a_n$ is an oscillating
sequence of negative powers of $-2$ until a sequence of 90 nines
occurs in $\pi$; from then onwards the sequence is constant:

\[
1, -\tfrac{1}{2}, \tfrac{1}{4}, -\tfrac{1}{8}, \ldots, (-2)^{-k}, (-2)^{-k}, (-2)^{-k}, \ldots
\]





The sequence determines a real number $a$, in the sense that it satisfies the Cauchy
condition. The sequence is well-defined, and $N(m)$ can be checked for each $m$ (at
least in principle). For this particular $a$, however, one cannot say that it is positive,
negative or zero.

$a > 0 \Leftrightarrow N(k)$ holds for the first time for an even number

$a < 0 \Leftrightarrow N(k)$ holds for the first time for an odd number

$a = 0 \Leftrightarrow N(k)$ holds for no $k$
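
Each member of the sequence is computable from finitely many decimals, which is what makes the definition constructively acceptable. A minimal sketch follows, assuming the decimals are supplied as a string of digits with $p_1$ first. The parameter `run` (90 in the text) and the digit string used below are hypothetical and chosen only to keep the example small; the digits are made up and are not those of $\pi$.

```python
from fractions import Fraction

def first_run_end(digits, n, run=90):
    """Least k <= n such that the decimals p_(k-run+1), ..., p_k are all 9, else None."""
    count = 0
    for k in range(1, n + 1):
        count = count + 1 if digits[k - 1] == "9" else 0
        if count >= run:
            return k
    return None

def a(n, digits, run=90):
    """The n-th member of the sequence; `digits` needs at least n entries."""
    k = first_run_end(digits, n, run)
    exponent = n if k is None else k
    return Fraction(1, (-2) ** exponent)

# Made-up decimals in which a run of three nines ends at position 7.
digits = "141599926"
print([str(a(n, digits, run=3)) for n in range(10)])
```

Deciding the sign of the limit $a$ is a different matter: it would require knowing whether a run ever occurs, and at an even or an odd position, which no finite amount of computation of this kind settles in general.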


Since there is no effective information on the status of the occurrence of $N(k)$s, one
cannot affirm the trichotomy law; i.e., $\forall x \in \mathbb{R}\,(x < 0 \vee x = 0 \vee x > 0)$ cannot be
said to have a proof. The above number $a$ cannot be irrational, for then $N(k)$ would
never apply, and hence $a = 0$. Contradiction. Hence it has been shown that $\neg\neg$($a$ is
rational). But there is no proof that $a$ is rational. So $\neg\neg A \to A$ fails. One also easily
sees that $a = 0 \vee a \neq 0$ fails to have a proof.

Such Brouwerian counterexamples are weak in the sense that they show that some
proposition has as yet no proof, but it is not excluded that eventually a proof may
be found. In formal logic, there is a similar distinction between $\nvdash A$ and $\vdash \neg A$.
The Brouwerian counterexamples are similar to the first case, and strong
counterexamples cannot always be expected. For example, although there are instances of
the Principle of the Excluded Middle (PEM) where no proof has been provided,
the negation cannot be proved. For $\neg(A \vee \neg A)$ is equivalent to $\neg A \wedge \neg\neg A$, which
is a contradiction! Some strong refutations of classical principles are given in later
sections.


11.2. Formalization of Intuitionistic Logic 


In 1928, the Dutch Mathematics Society posed in its traditional prize contest a
problem asking for a formalization of Brouwer's logic. Heyting sent in an essay,
with the motto ‘stones for bread,’ in which he provided a formal system for
intuitionistic predicate logic. The system was presented in ‘Hilbert style’ (as it is
called now), i.e., a system with only two derivation rules and a large number of
axioms. Heyting had patiently checked the system of Whitehead’s and Russell's
(1910-13) Principia Mathematica, and isolated a set of intuitionistically acceptable
axioms. His system was chosen in such a way that the addition of PEM yields full
classical logic. Here is a list of axioms for IQC taken from Troelstra and van Dalen
(1988, p. 68). (In this language negation, $\neg A$, is defined as $A \to \bot$.)

\begin{gather*}
A \wedge B \to A, \qquad A \wedge B \to B \\
A \to (B \to (A \wedge B)) \\
A \to A \vee B, \qquad B \to A \vee B \\
(A \to C) \to ((B \to C) \to (A \vee B \to C)) \\
A \to (B \to A) \\
(A \to (B \to C)) \to ((A \to B) \to (A \to C)) \\
\bot \to A \\
A(t) \to \exists x A(x) \\
\forall x (A(x) \to B) \to (\exists x A(x) \to B) \\
\forall x A(x) \to A(t) \\
\forall x (B \to A(x)) \to (B \to \forall x A(x))
\end{gather*}







(The quantifier axioms are subject to the usual conditions: $t$ is free for $x$ in $A(x)$ and
$x$ does not occur free in $B$.) There are two derivation rules: the well-known modus
ponens (i.e., $\to E$, see below) and the rule of generalization:

\[
\vdash_i A(x) \ \Rightarrow\ \vdash_i \forall x A(x)
\]

where $\vdash_i$ stands for intuitionistic derivability. (When no confusion arises, $\vdash_i$ will
simply be written as $\vdash$.) The $\forall I$-rule (see below) would have done just as well.

Gentzen (1935) introduced two new kinds of formalization of logic. Both of
these are eminently suited for the investigation of formal derivations as objects in
their own right. The first is the system of Natural Deduction, which is described
here. His second kind, the Sequent Calculus, is not discussed, although it too has
important proof-theoretical uses.

In the system of Natural Deduction, there are introduction and elimination rules
for the logical connectives that reflect their meanings. The rules are given here in an
abbreviated notation.




















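\[
\begin{array}{ll}
\wedge I & \dfrac{A \quad B}{A \wedge B} \\[2ex]
\wedge E & \dfrac{A \wedge B}{A} \qquad \dfrac{A \wedge B}{B} \\[2ex]
\vee I & \dfrac{A}{A \vee B} \qquad \dfrac{B}{A \vee B} \\[2ex]
\vee E & \dfrac{A \vee B \quad \begin{array}{c}[A]\\ \mathcal{D}\\ C\end{array} \quad \begin{array}{c}[B]\\ \mathcal{D}'\\ C\end{array}}{C} \\[2ex]
\to I & \dfrac{\begin{array}{c}[A]\\ \mathcal{D}\\ B\end{array}}{A \to B} \\[2ex]
\to E & \dfrac{A \quad A \to B}{B} \\[2ex]
\bot E & \dfrac{\bot}{A} \\[2ex]
\forall I & \dfrac{\begin{array}{c}\mathcal{D}\\ A(x)\end{array}}{\forall x A(x)} \\[2ex]
\forall E & \dfrac{\forall x A(x)}{A(t)} \\[2ex]
\exists I & \dfrac{A(t)}{\exists x A(x)} \\[2ex]
\exists E & \dfrac{\exists x A(x) \quad \begin{array}{c}[A(x)]\\ \mathcal{D}\\ C\end{array}}{C}
\end{array}
\]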

There are a few conventions for the formulation of the natural deduction
system.


(i) A hypothesis of a derivation between square brackets is cancelled, that is to
say, it no longer counts as a hypothesis. Usually all of the hypotheses are
cancelled simultaneously. This is not really required, and it may even be
necessary to allow for ‘selective’ cancellation (Troelstra and van Dalen, 1988,
p. 559, and p. 568). For most practical purposes, however, it is a convenient
convention.
(ii) For the quantifier rules, there are some natural conditions on the free variables
of the rule. In $\forall I$, the variable $x$ may not occur free in the hypotheses of $\mathcal{D}$. In
$\forall E$ and $\exists I$, the term $t$ must be free for $x$ in $A(x)$. Finally, in $\exists E$, the variable $x$
may not occur in $C$ or in the remaining hypotheses of $\mathcal{D}$. For an explanation,
see van Dalen (1997).





(It is worth noting that strengthening the rule $\bot E$ to the classical absurdity rule

\[
\dfrac{\begin{array}{c}[\neg A]\\ \mathcal{D}\\ \bot\end{array}}{A}
\]

suffices for classical logic [see chapter 1]. This rule is equivalent to having axioms of
the form $\neg\neg A \to A$ or a rule of double-negation elimination. It enables the classical
principle of the excluded middle, PEM, $A \vee \neg A$, to be proved.)

The intuitionistic rules above are instructive for more than one reason. In the first
place, they illustrate the idea of the proof interpretation described above. Recall that
$p : A \to B$ stands for ‘for every $a : A$ ($p(a) : B$)’. Now $\to E$ says that if one has a
derivation $\mathcal{D}$ of $A$ and a derivation $\mathcal{D}'$ of $A \to B$, then the combination,

\[
\dfrac{\begin{array}{c}\mathcal{D}\\ A\end{array} \quad \begin{array}{c}\mathcal{D}'\\ A \to B\end{array}}{B}\ {\to}E
\]

is a derivation of $B$. So there is an automatic procedure that, given $\mathcal{D}'$, converts
$\mathcal{D}$ into a derivation $\mathcal{D}''$ of $B$.

The introduction rule also illustrates the transformation character of the
implication. Let a derivation $\mathcal{D}$ of $B$ from the hypothesis $A$ be given; then $\to I$ yields

\[
\dfrac{\begin{array}{c}[A]\\ \mathcal{D}\\ B\end{array}}{A \to B}\ {\to}I
\]

The first derivation shows that by adding a proof $\mathcal{D}'$ of $A$, one automatically gets a
proof of $B$. So the rules say that there is a particular construction, converting proofs
of $A$ into proofs of $B$. This is exactly the justification for the derivation of $A \to B$.
For the conjunction rules, the analogy is even more striking. Furthermore, as will be
seen below, the correspondence can be made even more explicit in the
Curry-Howard isomorphism.

Both Martin-Löf (1984) and Dummett (1977) have argued that the introduction
and elimination rules of a natural deduction system determine the meanings of the
connectives by their use. For example, $\wedge I$ says that if one knows (has evidence for)
$A$ and $B$, then one also knows $A \wedge B$. So $\wedge I$ specifies what one has to require for
$A \wedge B$. The elimination rule says what one may claim on the grounds of evidence for
$A \wedge B$. If one knows $A \wedge B$ then one also knows $A$ and likewise $B$. These rules have
been chosen so that they are ‘in harmony.’ The evidence required in $\wedge I$ is exactly the
evidence one can derive in $\wedge E$. Dummett (1975) used this feature of the rules to
support his claim that intuitionistic logic fits the requirement that ‘meaning is use,’
insisting that mathematical knowledge be demonstrable. ‘The grasp of the meaning
of a mathematical statement must, in general, consist of a capacity to use that
statement in a certain way, or to respond in a certain way to its use by others.’ As
a consequence the traditional, Platonistic, notion of truth has to be replaced by
something more palpable; the notion of proof is exactly what will fill the need for
communicability and observability. Hence the slogan ‘a grasp of the meaning of a
statement consists in a capacity to recognize a proof of it when one is presented to
us.’ This, of course, is in complete accord with intuitionistic practice. The rejection
of the Platonistic notion of truth is indeed an aspect of Dummett’s anti-realism.
Brouwer had always denied the realistic thesis, that there is an outer world
independent of us.

As noted above, while Hilbert designed his proof theory for the purpose of
consistency proofs, Gentzen considered the structure of derivations themselves to be
objects for study, and, by means of ingenious proof-theoretic techniques, a number
of striking intuitionistic features can be shown, e.g., effective versions of the
disjunction and existence properties for a number of theories. (These properties are
discussed in section 11.3.) Moreover, these proof-theoretic methods have the advantage
over the semantical approach to be discussed in section 11.3 in that these results are
directly constructive.

Since natural deduction is so close in nature to the proof interpretation, it is
perhaps not surprising that a formal correspondence between a term calculus and
natural deduction can be established. This will be demonstrated for a small
fragment, containing only the connective $\to$. Consider an $\to$-introduction:

\[
\dfrac{\begin{array}{c}[A]\\ \mathcal{D}\\ B\end{array}}{A \to B}
\qquad\qquad
\dfrac{\begin{array}{c}[x : A]\\ \mathcal{D}\\ t : B\end{array}}{\lambda x.t : A \to B}
\]





Assign in a systematic way $\lambda$-terms to formulas in the derivation. Since $A$ is an
assumption, it has a hypothetical proof term, say $x$. On discharging the hypothesis,
introduce a $\lambda x$ in front of the (given) term $t$ for $B$. By binding $x$, the proof term for
$A \to B$ no longer depends on the hypothetical proof $x$ of $A$. Similarly, the
elimination runs as follows:

\[
\dfrac{A \to B \quad A}{B}
\qquad\qquad
\dfrac{t : A \to B \quad s : A}{t(s) : B}
\]





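Observe the analogy to the proof interpretation. Consider a particular derivation,

\[
\dfrac{\dfrac{[A]}{B \to A}}{A \to (B \to A)}
\qquad\qquad
\dfrac{\dfrac{[x : A]}{\lambda y.x : B \to A}}{\lambda xy.x : A \to (B \to A)}
\]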


Thus the proof term of $A \to (B \to A)$ is $\lambda xy.x$, which is the Curry combinator $K$.
Note that the informal argument given for this formula in section 11.1 is faithfully reflected.
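
The assignment of terms to derivations can be mimicked by a small type checker for the implication fragment. This is a minimal sketch, assuming a tuple encoding of terms and types and explicit type annotations on bound variables; the names `type_of`, `arrow` and the encoding are assumptions of the illustration, not part of the formal system described here.

```python
def arrow(a, b):
    """The type (or formula) a -> b."""
    return ("->", a, b)

def type_of(term, context=None):
    """Compute the type of a term, i.e. the formula its derivation proves.

    term    -- ("var", x), ("lam", x, type_of_x, body) or ("app", function, argument)
    context -- dict assigning types to free variables (the open hypotheses)
    """
    context = context or {}
    tag = term[0]
    if tag == "var":
        return context[term[1]]
    if tag == "lam":
        # implication introduction: discharge the hypothesis x : a
        _, x, a, body = term
        return arrow(a, type_of(body, {**context, x: a}))
    if tag == "app":
        # implication elimination (modus ponens)
        _, function, argument = term
        f_type = type_of(function, context)
        if not (isinstance(f_type, tuple) and f_type[0] == "->"):
            raise TypeError("only a proof of an implication can be applied")
        if f_type[1] != type_of(argument, context):
            raise TypeError("the argument does not prove the antecedent")
        return f_type[2]
    raise ValueError(f"unknown term: {tag}")


# The combinator K, with binders annotated by the formulas A and B.
K = ("lam", "x", "A", ("lam", "y", "B", ("var", "x")))
print(type_of(K))
```

A term that fails to type-check corresponds to a tree that is not a correct derivation, for instance an application whose argument proves the wrong formula.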
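Now consider a cut elimination conversion:

\[
\dfrac{\dfrac{\begin{array}{c}[x : A]\\ \mathcal{D}\\ t : B\end{array}}{\lambda x.t : A \to B} \quad \begin{array}{c}\mathcal{D}'\\ s : A\end{array}}{(\lambda x.t)(s) : B}
\qquad\text{reduces to}\qquad
\begin{array}{c}\mathcal{D}'\\ s : A\\ \mathcal{D}[s/x]\\ t[s/x] : B\end{array}
\]

The proof-theoretic conversion thus corresponds to the $\beta$-reduction of the
$\lambda$-calculus.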
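
On the term side the conversion is a single substitution step. The sketch below implements it for untyped terms with capture-avoiding substitution; the tuple encoding and the helper names (`free_vars`, `fresh`, `subst`, `beta`) are assumptions made for this illustration.

```python
import itertools

def free_vars(t):
    """Free variables of a term ("var", x), ("lam", x, body) or ("app", f, a)."""
    if t[0] == "var":
        return {t[1]}
    if t[0] == "lam":
        return free_vars(t[2]) - {t[1]}
    return free_vars(t[1]) | free_vars(t[2])

def fresh(avoid):
    """A variable name of the form v0, v1, ... that does not occur in `avoid`."""
    return next(f"v{i}" for i in itertools.count() if f"v{i}" not in avoid)

def subst(t, x, s):
    """The term t[s/x]; bound variables are renamed when needed to avoid capture."""
    if t[0] == "var":
        return s if t[1] == x else t
    if t[0] == "app":
        return ("app", subst(t[1], x, s), subst(t[2], x, s))
    y, body = t[1], t[2]
    if y == x:
        return t  # x is rebound here, so nothing below is substituted
    if y in free_vars(s):
        z = fresh(free_vars(s) | free_vars(body) | {x})
        body, y = subst(body, y, ("var", z)), z
    return ("lam", y, subst(body, x, s))

def beta(t):
    """Contract a top-level redex (lambda x. t)(s) to t[s/x]; otherwise return t."""
    if t[0] == "app" and t[1][0] == "lam":
        _, (_, x, body), s = t
        return subst(body, x, s)
    return t


K = ("lam", "x", ("lam", "y", ("var", "x")))
print(beta(("app", K, ("var", "a"))))
print(beta(("app", K, ("var", "y"))))
```

The second call shows why renaming matters: substituting the free variable `y` under the binder for `y` without renaming would turn the constant function into the identity.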

To deal with full predicate logic, specific operations need to be introduced to
render the meaning of the connectives and their derivation rules. Here is a list:





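$p$            pairing
$p_0, p_1$     projections
$D$            discriminator (‘case dependency’)
$k$            case obliteration
$E$            witness extractor
$\bot$         ex falso operator


\[
\begin{array}{ll}
\wedge I & \dfrac{t_0 : A_0 \quad t_1 : A_1}{p(t_0, t_1) : A_0 \wedge A_1} \\[2ex]
\wedge E & \dfrac{t : A_0 \wedge A_1}{p_i(t) : A_i} \quad (i \in \{0, 1\}) \\[2ex]
\vee I & \dfrac{t : A_i}{k_i t : A_0 \vee A_1} \quad (i \in \{0, 1\}) \\[2ex]
\vee E & \dfrac{t : A \vee B \quad \begin{array}{c}[u : A]\\ t_0[u] : C\end{array} \quad \begin{array}{c}[v : B]\\ t_1[v] : C\end{array}}{D_{u,v}(t, t_0[u], t_1[v]) : C} \\[2ex]
\to I & \dfrac{\begin{array}{c}[x : A]\\ t : B\end{array}}{\lambda x^A.t : A \to B} \\[2ex]
\to E & \dfrac{t : A \to B \quad s : A}{t(s) : B} \\[2ex]
\bot E & \dfrac{t : \bot}{\bot_A(t) : A} \\[2ex]
\forall I & \dfrac{t[x] : A(x)}{\lambda x.t[x] : \forall x A(x)} \\[2ex]
\forall E & \dfrac{t : \forall x A(x)}{t(s) : A(s)} \\[2ex]
\exists I & \dfrac{t : A(s)}{p(s, t) : \exists x A(x)} \\[2ex]
\exists E & \dfrac{t : \exists x A(x) \quad \begin{array}{c}[v : A(u)]\\ t_1[u, v] : C\end{array}}{E_{u,v}(t, t_1[u, v]) : C}
\end{array}
\]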








There are a number of details to mention:


(i) In $\to I$, the dependency on the hypothesis has to be made explicit in the term.
This is done by assigning to each hypothesis its own variable, e.g., $x^A : A$.

(ii) In $\vee E$ (and similarly $\exists E$) the dependency on the particular (auxiliary)
hypotheses $A$ and $B$ disappears. This is done by a variable binding technique. In $\vee E$
and $\exists E$, $D_{u,v}$ and $E_{u,v}$ bind the variables $u$ and $v$.

(iii) In the falsum rule, the result, of course, depends on the conclusion $A$. So $A$
has its own ex falso operator $\bot_A$.


Now the conversion rules for the derivation automatically suggest the conversion for
the term.

Given the correspondence between the term calculus and the natural deduction
system, one may see a correspondence between proofs and propositions on the one
hand and elements (given by the terms) and types (the spaces where these terms
are to be found) on the other. This was first observed for the implication fragment by Curry
(Curry and Feys, 1958, ch. 9, section E), and extended to full intuitionistic logic
by Howard (1980). Here is a simple case, the fragment considered by Curry.

Since the meaning of a proposition is expressed in terms of possible proofs (one
knows the meaning of $A$ if one knows what things qualify as its proofs), one may
take an abstract view and consider a proposition as its collection of proofs. From this
viewpoint, there is a striking analogy between propositions and sets. A set has
elements, and a proposition has proofs. As seen, proofs are actually a special kind of
construction, and they operate on each other. For example, if there is a proof
$p : A \to B$ and a proof $q : A$ then $p(q) : B$. So proofs are naturally typed objects.

Similarly, one may consider sets as being typed in a specific way. If $A$ and $B$ are
typed sets then the set of all mappings from $A$ to $B$ is of a higher type, denoted by
$A \to B$ or $B^A$. Starting from certain basic sets with types, one can construct higher
types by iterating this ‘function space’ operation. Denote ‘$a$ is in type $A$’ by $a \in A$.
Then there is the striking parallel shown in Table 11.2.








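Table 11.2

Propositions                                              Types

$a : A$                                                   $a \in A$
$p : A \to B$, $q : A \Rightarrow p(q) : B$               $p \in A \to B$, $q \in A \Rightarrow p(q) \in B$
$x : A \Rightarrow t(x) : B$, then $\lambda x.t : A \to B$        $x \in A \Rightarrow t(x) \in B$, then $\lambda x.t \in A \to B$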





It now is a matter of finding the right types corresponding to the remaining
connectives. For $\wedge$ and $\vee$, a product type and a disjoint sum type are introduced. For
the quantifiers, generalizations are available. The reader is referred to the literature:
Gallier (1995) and Howard (1980).

The main aspect of the Curry-Howard isomorphism (also known as ‘formulas
(or propositions) as types’) is the faithful correspondence:

\[
\frac{\text{proofs}}{\text{propositions}} = \frac{\text{elements}}{\text{types}}
\]

with their conversion and normalization properties. [See chapter 13, section 13.8,
for some related discussion.]

Martin-Löf was the first logician to see the full importance of the connection
between intuitionistic logic and type theory. Indeed, in his approach, the two are so
closely interwoven that they actually merge into one master system. His type systems
are no mere technical innovations, but they intend to catch the foundational
meaning of intuitionistic logic and the corresponding mathematical universe. Expositions
can be found in, for example, Martin-Löf (1975, 1984).

Constable and his collaborators have based a proof checking system on natural
deduction and Martin-Löf’s type theory. It is a tool for computer-assisted
development of proofs in intuitionistic systems, and it provides proof terms for provable
sentences. An early exposition may be found in Constable et al. (1986); the reader
should consult the modern literature for updated versions. There are a number of
proof checking systems available, e.g. the system Coq of Coquand.


11.3. Semantics 


The study of interpretations is called semantics; from this viewpoint, formulas denote
certain things. Frege already pointed out that propositions denote truth values,
namely ‘true’ and ‘false’, conveniently denoted by 1 and 0, and the interpretation
of logic under this two-valued semantics is handled by the well-known truth-tables
(van Dalen, 1997) [see chapter 1].

This method has a serious drawback: all propositions are supposed to be true or
false, and PEM automatically holds (though some might perhaps see this as an
advantage rather than a drawback). It certainly is too much of a good thing for
intuitionistic logic. By our choice of axioms (or rules), intuitionistic logic is a subsystem
of classical logic, so the two-valued semantics obliterates the distinction between the
two logics: too many propositions become true!

One might say that there is no need to worry about the problem of semantics,
since one already has the intended proof interpretation. Although this certainly is
the case, the proof interpretation is not specific enough to yield sharp decisive results,
as one would like in model theory. One would need more assumptions about
‘construction’ before technical problems can be settled.

All of the formal semantics discussed below are strongly complete for intuitionistic
predicate and propositional logic, in the sense that

$$\Gamma \vdash_i A \iff \Gamma \models A$$

where $\models$ is the semantical consequence relation in the particular semantics.


11.3.1. The topological interpretation


In the mid-1930s, a number of systematic semantics were introduced that promised
to do for intuitionistic logic what the ordinary truth-tables did for classical logic.
Heyting had already introduced many-valued truth-tables in his formalization
papers, e.g., to establish non-definability of the connectives. Then Jaskowski (1936)
presented a truth-table family that characterized intuitionistic propositional logic.
Gödel (1932) had dispelled the expectation that intuitionistic logic was the logic of
some specific finite truth value system.

An elegant interpretation was introduced by Tarski (1938). It was, in fact, a
generalization of the Boolean valued interpretation of classical logic. Ever since
Boole, it was known that the laws of that logic correspond exactly to those of
Boolean algebra (think of the powerset of a given set with $\cap$, $\cup$ and $^c$ as operations,
corresponding to $\wedge$, $\vee$ and $\neg$). Now one wants a similar algebra with the property
that $U^{cc} \neq U$ (for $\neg\neg A$ and $A$ are not equivalent). By Brouwer's theorem
($\neg\neg\neg A \leftrightarrow \neg A$), one expects $U^{ccc} = U^c$. The remaining laws of logic demand that


$$(U \cap V)^{cc} = U^{cc} \cap V^{cc}$$
and
$$(U \cup V)^{cc} \supseteq U^{cc} \cup V^{cc}$$


This suggests that the operator $^{cc}$ behaves like a closure operator in topology. It
turns out that it is a good choice to let the open sets in a topological space $X$
play the role of arbitrary sets in the power set of $X$. So the family $\mathcal{O}(X)$ plays the
role of $\mathcal{P}(X)$.

Here is the notation: $[\![A]\!]$ is the open set of $X$ assigned to $A$. The valuation
$[\![\cdot]\!] : PROP \to \mathcal{O}(X)$ is defined inductively for all propositions; let $[\![A]\!]$ be given for all
atomic $A$, where $[\![\bot]\!] = \emptyset$, then

$$[\![A \wedge B]\!] = [\![A]\!] \cap [\![B]\!]$$
$$[\![A \vee B]\!] = [\![A]\!] \cup [\![B]\!]$$
$$[\![A \to B]\!] = \mathrm{Int}([\![A]\!]^c \cup [\![B]\!])$$
$$[\![\neg A]\!] = \mathrm{Int}([\![A]\!]^c)$$

Here $\mathrm{Int}(K)$ is the interior of the set $K$ (i.e., the largest open subset of $K$). Note
that this looks very much like the traditional Venn diagrams, with the extra requirement
that negation is interpreted by the interior of the complement. This is necessary
if one wants open sets all the way. A simple calculation shows that

$$\vdash_i A \Rightarrow [\![A]\!] = X$$

Since $X$ is the largest open set in $\mathcal{O}(X)$, it is plausible to call $A$ true in $\mathcal{O}(X)$ if
$[\![A]\!] = X$.

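This interpretation can be used to show underivability of propositions. Consider,
for example, $\mathcal{O}(\mathbb{R})$, the open sets of real numbers. Assign to the atom $A$ the
set $\mathbb{R} - \{0\}$. Then

$$[\![A]\!] = \mathbb{R} - \{0\}$$
$$[\![\neg A]\!] = \mathrm{Int}(\{0\}) = \emptyset$$
$$[\![\neg\neg A]\!] = \mathbb{R}$$
$$[\![\neg\neg A \to A]\!] = \mathrm{Int}([\![\neg\neg A]\!]^c \cup [\![A]\!]) = \mathrm{Int}(\emptyset \cup [\![A]\!]) = [\![A]\!] \neq \mathbb{R}$$

Hence $\nvdash_i \neg\neg A \to A$. Similarly $\nvdash_i A \vee \neg A$, for $[\![A \vee \neg A]\!] = [\![A]\!] \neq \mathbb{R}$.

The topological interpretation can be extended to predicate logic. Let a domain $D$
be given, then

$$[\![\exists x A(x)]\!] = \bigcup \{[\![A(d)]\!] \mid d \in D\}$$
$$[\![\forall x A(x)]\!] = \mathrm{Int}\left(\bigcap \{[\![A(d)]\!] \mid d \in D\}\right)$$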


The topological interpretation is indeed complete for intuitionistic logic. Say
that $A$ is true in $\mathcal{O}(X)$ if $[\![A]\!] = X$ for all assignments of open sets to atoms, and that $A$ is true
if $A$ is true under all topological interpretations. Completeness can now be formulated
as usual: $\vdash_i A \iff A$ is true. Classical logic appears as a special case when one
provides a set $X$ with the discrete topology (every subset is open): $\mathcal{O}(X) = \mathcal{P}(X)$.
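
The valuation clauses can be tried out mechanically on a finite space. The following is a minimal sketch and not part of the original exposition: it assumes the two-point Sierpinski space (open sets $\emptyset$, $\{1\}$, $\{0, 1\}$) in place of the real line, and the names `OPENS`, `interior` and `value` are illustrative only.

```python
# Assumption: a hypothetical finite model, the Sierpinski space X = {0, 1},
# standing in for the real line used in the text.
X = frozenset({0, 1})
OPENS = [frozenset(), frozenset({1}), X]


def interior(s):
    """Largest open set contained in s: the union of all open subsets of s."""
    result = frozenset()
    for u in OPENS:
        if u <= s:
            result |= u
    return result


def value(formula, atoms):
    """Open set assigned to a formula, following the inductive clauses.

    Formulas are nested tuples: ('atom', name), ('bot',), ('and', a, b),
    ('or', a, b), ('imp', a, b), ('not', a).
    """
    tag = formula[0]
    if tag == 'atom':
        return atoms[formula[1]]
    if tag == 'bot':
        return frozenset()
    if tag == 'and':
        return value(formula[1], atoms) & value(formula[2], atoms)
    if tag == 'or':
        return value(formula[1], atoms) | value(formula[2], atoms)
    if tag == 'imp':
        return interior((X - value(formula[1], atoms)) | value(formula[2], atoms))
    if tag == 'not':
        return interior(X - value(formula[1], atoms))
    raise ValueError(f"unknown connective: {tag}")


A = ('atom', 'A')
ATOMS = {'A': frozenset({1})}          # assign the open set {1} to the atom A
DNE = ('imp', ('not', ('not', A)), A)  # double negation elimination
LEM = ('or', A, ('not', A))            # excluded middle
TRIPLE = ('imp', ('not', ('not', ('not', A))), ('not', A))  # Brouwer's theorem

for name, formula in [('DNE', DNE), ('LEM', LEM), ('TRIPLE', TRIPLE)]:
    print(name, 'true in O(X):', value(formula, ATOMS) == X)
```

Under this assignment the first two formulas receive the value $\{1\} \neq X$, so by the soundness statement above neither is derivable, while the instance of Brouwer's theorem receives the value $X$.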

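The algebra of open subsets of a topological space is a special case of a Heyting
algebra, which is defined, much like Boolean algebras, by axioms for the various
operations. It has binary operations $\wedge$, $\vee$, $\to$, a unary operation $\neg$ and two constants,
0, 1. The laws are:

$$a \wedge b = b \wedge a$$
$$a \vee b = b \vee a$$
$$a \wedge (b \wedge c) = (a \wedge b) \wedge c$$
$$a \vee (b \vee c) = (a \vee b) \vee c$$
$$a \wedge (b \vee c) = (a \wedge b) \vee (a \wedge c)$$
$$a \vee (b \wedge c) = (a \vee b) \wedge (a \vee c)$$
$$1 \wedge a = a$$
$$1 \vee a = 1$$
$$a \to a = 1$$
$$a \wedge (a \to b) = a \wedge b$$
$$b \wedge (a \to b) = b$$
$$a \to (b \wedge c) = (a \to b) \wedge (a \to c)$$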
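
As a sanity check on the laws above, one can verify them by brute force in a small Heyting algebra. The sketch below is an illustration rather than anything from the text: it assumes the three-element chain $0 < \tfrac{1}{2} < 1$ with the usual implication ($a \to b = 1$ if $a \leq b$, and $b$ otherwise), and it assumes the standard definition $\neg a = a \to 0$. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Assumption: the three-element chain 0 < 1/2 < 1 as a sample Heyting algebra.
BOT, MID, TOP = Fraction(0), Fraction(1, 2), Fraction(1)
H = [BOT, MID, TOP]


def meet(a, b):
    return min(a, b)


def join(a, b):
    return max(a, b)


def imp(a, b):
    # Relative pseudo-complement on a chain.
    return TOP if a <= b else b


def neg(a):
    # Assumed standard definition: negation is implication of bottom.
    return imp(a, BOT)


def laws_hold():
    """Check the twelve laws for every triple of elements."""
    for a, b, c in product(H, repeat=3):
        checks = [
            meet(a, b) == meet(b, a),
            join(a, b) == join(b, a),
            meet(a, meet(b, c)) == meet(meet(a, b), c),
            join(a, join(b, c)) == join(join(a, b), c),
            meet(a, join(b, c)) == join(meet(a, b), meet(a, c)),
            join(a, meet(b, c)) == meet(join(a, b), join(a, c)),
            meet(TOP, a) == a,
            join(TOP, a) == TOP,
            imp(a, a) == TOP,
            meet(a, imp(a, b)) == meet(a, b),
            meet(b, imp(a, b)) == b,
            imp(a, meet(b, c)) == meet(imp(a, b), imp(a, c)),
        ]
        if not all(checks):
            return False
    return True


print('laws hold:', laws_hold())
print('a or not a at 1/2:', join(MID, neg(MID)))
print('not not a at 1/2:', neg(neg(MID)))
```

The middle element shows that such an algebra need not be Boolean: $a \vee \neg a$ takes the value $\tfrac{1}{2}$ rather than 1, and $\neg\neg a$ differs from $a$.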
Since it is necessary for the interpretation of predicate logic to allow infima and
suprema of collections of elements, one often considers complete Heyting algebras,
algebras with the property that, for a collection $\{a_i \mid i \in I\}$ of elements, there is
a supremum $\bigvee_{i \in I} a_i$ (the unique least element majorizing all $a_i$), and an infimum
$\bigwedge_{i \in I} a_i$. For a complete Heyting algebra, the laws can be simplified somewhat. Adopt
the standard axioms for a lattice and add

$$a \wedge \bigvee S = \bigvee \{a \wedge s \mid s \in S\}$$


11.3.2. Beth-Kripke semantics


Elegant as the topological interpretation may be, it is not as flexible as two later
interpretations introduced by Beth and Kripke, both of which have excellent heuristics.
This section considers a hybrid semantics, Beth-Kripke models, introduced in
van Dalen (1984).

The basic idea is to mimic the mental activity of Brouwer’s individual, who creates
all of mathematics by himself. This idealized mathematician, also called the creating
subject by Brouwer, is involved in the construction of mathematical objects, and in
the construction of proofs of statements. This process takes place in time. So, at
each moment, he may create new elements, and, at the same time, he observes
the basic facts that hold for his universe so far. In passing from one moment in
time to the next, he is free to choose how to continue his activity, so the picture
of his possible activity looks like a partially ordered set (even like a tree). At each
moment, there is a number of possible next stages. These stages have become
known as possible worlds.

Consider, for the moment, the first-order case; that is to say, consider elements
of one and the same sort, and a finite number of relations and functions (as in a
standard first-order language). The stages for the individual form a partially ordered
set $(K, \leq)$. View $k \leq \ell$ as ‘$k$ is before $\ell$ or coincides with $\ell$.’ For each $k \in K$, there
is a local domain of elements created so far, denoted by $D(k)$. It is reasonable to
assume that no elements are destroyed later, so $k \leq \ell \Rightarrow D(k) \subseteq D(\ell)$.

A path in the poset $K$ is a maximal ordered subset. For a node $k$ in $K$, a bar $B$ is
a subset with the property that every path through $k$ intersects $B$. Now, to stipulate
how the individual arrives at the atomic facts: He does not necessarily establish an
atomic fact $A$ ‘at the spot,’ but he will, no matter how he pursues his research,
establish $A$ eventually. This means that there is a bar $B$ for $k$ such that, for all nodes
$\ell$ in $B$, the statement $A$ holds at $\ell$. Note that the individual does not (have to)
observe composite states of affairs. The next step is to interpret the connectives
(Table 11.3). Write ‘$k \Vdash A$’ for ‘$A$ holds at $k$.’ The technical terminology is ‘$k$ forces
$A$.’ For atomic $A$, $k \Vdash A$ is already given, and $\bot$ is never forced.

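Table 11.3

$k \Vdash A \wedge B \iff k \Vdash A$ and $k \Vdash B$
$k \Vdash A \vee B \iff \exists B\, \forall \ell \in B\, (\ell \Vdash A$ or $\ell \Vdash B)$
$k \Vdash A \to B \iff \forall \ell \geq k\, (\ell \Vdash A \Rightarrow \ell \Vdash B)$
$k \Vdash \neg A \iff k \Vdash A \to \bot$
$\qquad \iff \forall \ell \geq k\, (\ell \Vdash A \Rightarrow \ell \Vdash \bot)$
$\qquad \iff \forall \ell \geq k\, (\ell \nVdash A)$
$k \Vdash \exists x A(x) \iff \exists B\, \forall \ell \in B\, \exists a \in D(\ell)\, (\ell \Vdash A(a))$
$k \Vdash \forall x A(x) \iff \forall \ell \geq k\, \forall a \in D(\ell)\, (\ell \Vdash A(a))$

(Here $B$ ranges over bars for $k$.)

Observe that the ‘truth’ at a node $k$ depends essentially on the future. This is an
important feature in intuitionism (and in constructive mathematics, in general). The
dynamic character of the universe demands that the future is taken into account.
This is particularly clear for $\forall$. If we claim that ‘All dogs are friendly,’
then one unfriendly dog in the future may destroy the claim.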

An individual model over a poset $K$ is denoted by $\mathcal{K}$. We say that ‘$A$ is true in a
Beth-Kripke model $\mathcal{K}$’ if for all $k \in K$, $k \Vdash A$. ‘$A$ is true’ if $A$ is true in all Beth-Kripke
models. ‘Semantical consequence’ is defined as $\Gamma \models A$ iff (if and only if) for
all Beth-Kripke models $\mathcal{K}$ and all $k \in K$, $k \Vdash C$ for all $C \in \Gamma \Rightarrow k \Vdash A$.

A Beth-Kripke model with the property that the bar $B$ in all the defining clauses
is precisely the node $k$ itself, is called a Kripke model. And if all the local domains
$D(k)$ are identical, we have a Beth model. Beth-Kripke-, Kripke- and Beth-semantics
are all strongly complete for intuitionistic logic: $\Gamma \vdash_i A \iff \Gamma \models A$ (special case: $\vdash_i A \iff A$
is true).

There are a number of simple properties that can easily be shown, e.g., forcing is
monotone, i.e., $k \Vdash A$, $k \leq \ell \Rightarrow \ell \Vdash A$, and in Beth-Kripke and Beth models


$$k \Vdash A \iff \exists B\, \forall \ell \in B\, (\ell \Vdash A)$$


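Similarly, Kripke models readily yield counterexamples like these. (Note, first, that
$k \Vdash \neg A$ iff $\forall \ell \geq k\, (\ell \nVdash A)$, i.e., $A$ is not forced after $k$; $k \nVdash \neg A$ iff for some $\ell \geq k$,
$\ell \Vdash A$. So $k \Vdash \neg\neg A$ if for each $\ell \geq k$, there is an $m \geq \ell$ with $m \Vdash A$.)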


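(a) Consider an atomic $A$ and let $k_0 < k_1$, $k_0 \nVdash A$, but $k_1 \Vdash A$. By the above remark,
    note that $k_0 \Vdash \neg\neg A$. Hence $k_0 \nVdash \neg\neg A \to A$. Furthermore $k_0 \nVdash \neg A$, so
    $k_0 \nVdash A \vee \neg A$.

(b) Let $k_0 < k_1$ and $k_0 < k_2$, with $k_1 \Vdash A$ and $k_2 \Vdash B$ (and no other atoms forced).
    Note that $k_0 \nVdash \neg A$, $k_0 \nVdash \neg B$, so $k_0 \nVdash \neg A \vee \neg B$.
    Also $k_0 \nVdash A \wedge B$, $k_1 \nVdash A \wedge B$, $k_2 \nVdash A \wedge B$, so $k_0 \Vdash \neg(A \wedge B)$.
    Hence $k_0 \nVdash \neg(A \wedge B) \to (\neg A \vee \neg B)$ (De Morgan’s law fails).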

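(c) Let $k_0 < k_1$ with $D(k_0) = \{0\}$ and $D(k_1) = \{0, 1\}$, and let $k_1 \Vdash A(0)$, $k_0 \nVdash A(0)$,
    $k_1 \nVdash A(1)$. Clearly $k_0 \nVdash \forall x A(x)$ and $k_1 \nVdash \forall x A(x)$, so $k_0 \Vdash \neg\forall x A(x)$.
    If $k_0 \Vdash \exists x \neg A(x)$ then $k_0 \Vdash \neg A(0)$. But this contradicts $k_1 \Vdash A(0)$. Hence
    $k_0 \nVdash \exists x \neg A(x)$. So $k_0 \nVdash \neg\forall x A(x) \to \exists x \neg A(x)$.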
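
The propositional counterexamples can be checked mechanically. The following sketch is an illustration, not the author's construction: it implements the Kripke forcing clauses (the bar is the node itself) for finite models, with a model given as a hypothetical pair `(above, atoms)`, where `above[k]` lists all nodes $\ell \geq k$ and `atoms[k]` is the set of atoms forced at $k$. The atom assignment is assumed to be monotone along the order.

```python
def forces(model, k, f):
    """Kripke forcing for propositional formulas on a finite model.

    model = (above, atoms): above[k] is the set of all l >= k (including k),
    atoms[k] is the set of atom names forced at k (assumed monotone).
    Formulas are tuples: ('atom', name), ('bot',), ('and', a, b),
    ('or', a, b), ('imp', a, b), ('not', a).
    """
    above, atoms = model
    tag = f[0]
    if tag == 'atom':
        return f[1] in atoms[k]
    if tag == 'bot':
        return False
    if tag == 'and':
        return forces(model, k, f[1]) and forces(model, k, f[2])
    if tag == 'or':
        return forces(model, k, f[1]) or forces(model, k, f[2])
    if tag == 'imp':
        return all((not forces(model, l, f[1])) or forces(model, l, f[2])
                   for l in above[k])
    if tag == 'not':
        return all(not forces(model, l, f[1]) for l in above[k])
    raise ValueError(f"unknown connective: {tag}")


A, B = ('atom', 'A'), ('atom', 'B')

# Example (a): two nodes 0 < 1, with A forced only at node 1.
MODEL_A = ({0: {0, 1}, 1: {1}}, {0: set(), 1: {'A'}})
# Example (b): a fork 0 < 1, 0 < 2, with A forced at 1 and B forced at 2.
MODEL_B = ({0: {0, 1, 2}, 1: {1}, 2: {2}}, {0: set(), 1: {'A'}, 2: {'B'}})

DNE = ('imp', ('not', ('not', A)), A)
LEM = ('or', A, ('not', A))
DE_MORGAN = ('imp', ('not', ('and', A, B)), ('or', ('not', A), ('not', B)))

print('root forces not-not-A:', forces(MODEL_A, 0, ('not', ('not', A))))
print('root forces DNE:', forces(MODEL_A, 0, DNE))
print('root forces LEM:', forces(MODEL_A, 0, LEM))
print('root forces De Morgan:', forces(MODEL_B, 0, DE_MORGAN))
```

The first-order example (c) would additionally need the local domains $D(k)$; it is left out here to keep the sketch short.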


There is an extensive model theory for Kripke semantics. In particular, there are a
large number of results on the structure of the partially ordered sets. For example,
intuitionistic predicate logic, IQC, is complete for Kripke models over trees, and,
for propositional logic, this can even be strengthened to completeness over finite
trees (the so-called finite model property: if $\nvdash_i A$, then $A$ is false in a Kripke model
over a finite tree).

The completeness over tree models can be used to prove the disjunction property


(DP) $\vdash_i A \vee B \Rightarrow\ \vdash_i A$ or $\vdash_i B$


The proof uses reductio ad absurdum. Suppose $\nvdash_i A$ and $\nvdash_i B$, then there is a tree
model $\mathcal{K}_1$ where $A$ is not forced, similarly a tree model $\mathcal{K}_2$ where $B$ is not forced.
The two models are glued together as follows: put the two models side by side and
place a new node $k$ below both. In this new node, no atomic proposition is forced. The
result is a correct Kripke model. Since $\vdash_i A \vee B$, then $k \Vdash A \vee B$, and hence $k \Vdash A$ or
$k \Vdash B$. By monotonicity, that contradicts the fact that $A$ and $B$ are not forced in $\mathcal{K}_1$ and $\mathcal{K}_2$.
Therefore $\vdash_i A$ or $\vdash_i B$.

There is a corresponding theorem for existential sentences, the existence property


(EP) $\vdash_i \exists x A(x) \Rightarrow\ \vdash_i A(t)$ for a closed term $t$


These theorems lend a pleasing support to the intuitionistic intended meaning of
‘existence’: if you have established a disjunction, you know that you can establish
one of the disjuncts. Similarly for existence: if you have shown the existence of
something, you can indeed point to a specific instance.

As shall be seen, the disjunction and existence property hold for a number of
prominent theories, the most important being arithmetic. DP and EP are often
considered the hallmark of constructive logic; one should, however, not overestimate
the significance of a technical result like this.

There is a snag in these conclusions. As the reader has seen, the proof made use
of reductio ad absurdum, i.e., the result has not been established constructively.
What one would like is a method that extracts from a proof of $\exists x A(x)$ a proof of
$A(d)$ for some $d$. Fortunately, there are proof theoretical devices that provide exactly
this kind of information; see, for example, van Dalen (1997, p. 211). Smorynski
(1982) has shown that in a fairly large number of cases ‘semantic’ proofs can be
made constructive.

Finite Beth models are not interesting for intuitionistic logic. For in each leaf
(maximal node) $A$ or $\neg A$ holds, so in each node $A \vee \neg A$ holds, i.e., the logic is
classical. Beth’s original models (Beth, 1956) are slightly more special; he considered
constant domains, i.e., $D_k = D_\ell$ for all $k, \ell \in K$. This is a certain drawback when
compared to Kripke models. The combinatorics of Beth models is just more complicated
than that of Kripke models, although Kripke (1965) showed that Kripke
and Beth models can be converted into each other.

Beth models are also easily convertible into topological models. For convenience,
Beth models over trees are considered. On a tree, one can define a topology as
follows: open sets are those subsets $U$ of the tree that are closed under successors,
i.e., $k \in U$ and $k \leq \ell \Rightarrow \ell \in U$. It is easily seen that these sets form a topology.
One also sees that $\{k \mid k \Vdash A\}$ is an open set, therefore put $[\![A]\!] = \{k \mid k \Vdash A\}$.
Now it is a matter of routine to show that the function $[\![\cdot]\!]$ is a valuation as defined
in subsection 11.3.1.
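
The claim that the successor-closed sets form a topology is easy to confirm on a small example. The sketch below is illustrative only: it assumes a hypothetical three-node tree (a root 0 with two successors 1 and 2), enumerates the successor-closed subsets, and checks closure under unions and intersections, which suffices in the finite case.

```python
from itertools import combinations


def up_sets(nodes, leq):
    """All subsets U of nodes such that k in U and k <= l imply l in U."""
    nodes = list(nodes)
    result = []
    for r in range(len(nodes) + 1):
        for combo in combinations(nodes, r):
            u = frozenset(combo)
            if all(l in u for k in u for l in nodes if leq(k, l)):
                result.append(u)
    return result


def is_topology(nodes, opens):
    """Finite check: contains the empty set and the whole space, and is
    closed under pairwise unions and intersections."""
    opens = set(opens)
    if frozenset() not in opens or frozenset(nodes) not in opens:
        return False
    return all((u | v) in opens and (u & v) in opens
               for u in opens for v in opens)


# Assumption: a hypothetical tree with root 0 and two leaves 1 and 2.
NODES = [0, 1, 2]


def leq(k, l):
    return k == l or k == 0


OPENS = up_sets(NODES, leq)
print('number of open sets:', len(OPENS))
print('forms a topology:', is_topology(NODES, OPENS))
```

On this tree a set such as $\{1\}$, the nodes forcing an atom that holds only at leaf 1, is open, in line with the observation that $\{k \mid k \Vdash A\}$ is always open.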

In addition, Beth semantics turns out to be a convenient tool in completeness proofs,
see Troelstra and van Dalen (1988, ch. 13), and Dummett (1977). In particular, it
is useful for rendering completeness proofs in an intuitionistic metamathematics.
Veldman (1976) was the first to consider a modified Kripke semantics, for which he
could give an intuitionistically correct completeness proof. Since then De Swart,
Friedman, Dummett, and Troelstra have given alternative versions for Beth semantics
(Troelstra and van Dalen, 1988, section 13.2). Furthermore, Beth models happen
to be better adapted to second-order arithmetic with function variables, the so-called
intuitionistic analysis. They allow for a very natural interpretation of choice
sequences (van Dalen, 1978, 1984).


11.3.3. Super semantics 


This chapter has presented a number of interpretations of intuitionistic theories, but
by no means all of them. There is, for example, a totally different interpretation of
intuitionistic logic (and arithmetic) in Kleene’s realizability interpretation. This is
based on algorithms, and, on the face of it, one could not find a similarity with the
above semantics. One might wonder if these semantics are totally unrelated, or
whether there is some common ground. The obvious common ground is the logic
that they are modeling, but that would not be sufficient to link, say, Kripke models
and realizability.

Fortunately, there is a general kind of semantics based on category theory.
Historically, these newer interpretations grew out of existing semantics, e.g., Scott
(1968, 1970) showed how one could capture strikingly intuitionistic features in his
extension of the topological interpretation to second-order systems. This section
looks at Scott’s original model, adapted to the so-called sheaf semantics.

Consider the topology of the real line, $\mathbb{R}$, and take the open sets to be the truth
values to interpret the intuitionistic theory of reals. The objects are partial continuous
real-valued functions with an open domain. This domain ‘measures’ the existence of
the object: $Ea \in \mathcal{O}(\mathbb{R})$. Thus this interpretation forces one to consider the notion of
partial object and the notions of ‘equivalence’ and ‘strict equality.’ All objects are
partial, and $a$ is said to be total if $Ea = \mathbb{R}$. Equality has to be reconsidered in this
light:








$$[\![a = b]\!] = \mathrm{Int}\{t \in \mathbb{R} \mid a(t) = b(t)\}$$


Observe that $[\![a = a]\!] = Ea$, and in general $[\![a = b]\!] \subseteq Ea \cap Eb$.
In addition to the notion of equality, there is a convenient notion of equivalence:
‘$a$ and $b$ coincide where they exist,’ written in symbols as

$$a \simeq b := Ea \vee Eb \to a = b$$

According to the above definition, $[\![a \simeq a]\!] = \mathbb{R}$. The operations on the partial
elements are defined pointwise:


$$(a + b)(t) = a(t) + b(t)$$
$$(a \cdot b)(t) = a(t) \cdot b(t)$$


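The definition of the inverse is, however, problematic. One can define it pointwise
for values distinct from 0, but then there seem to be problems with the invertibility
of non-zero elements. The fact is that ‘non-zero’ is not good enough to have an
inverse. Let $a(t) = t$, then

$$[\![a = 0]\!] = \mathrm{Int}\{t \mid a(t) = t = 0\} = \mathrm{Int}\{0\} = \emptyset$$
$$[\![a \neq 0]\!] = [\![\neg\, a = 0]\!] = \mathrm{Int}([\![a = 0]\!]^c) = \mathbb{R}$$

This $a$ is non-zero in the model. There is, however, no $b$ such that $[\![a \cdot b = 1]\!] = \mathbb{R}$.
This is where apartness comes in. It is well-known that a real number has an inverse
iff it is apart from 0. So, interpret $\#$ in the model:

$$[\![a \,\#\, b]\!] = \{t \mid a(t) \neq b(t)\}$$

The condition for an inverse now becomes:

$$a \,\#\, 0 \to \exists b\,(a \cdot b = 1)$$

The model now has an inverse for $a$. Take $b(t) = t^{-1}$ for $t \neq 0$, then

$$[\![a \,\#\, 0]\!] = \mathbb{R} - \{0\} \quad\text{and}\quad [\![\exists c\,(a \cdot c = 1)]\!] = \bigcup_c [\![a \cdot c = 1]\!] \supseteq [\![a \cdot b = 1]\!] = \mathbb{R} - \{0\}$$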





The introduction of an existence predicate, and partial equality, of course, carries
obligations for our quantifiers. In particular


$$\exists x A(x) \leftrightarrow \exists x\,(Ex \wedge A(x))$$
and
$$\forall x A(x) \leftrightarrow \forall x\,(Ex \to A(x))$$


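The logical axioms, or rules, thus have to be revised. It suffices to consider the
quantifier rules and the equality rules:

$\forall I$: if $A(x)$ has been derived from the hypothesis $[Ex]$, infer $\forall x A(x)$, discharging $[Ex]$
$\forall E$: from $\forall x A(x)$ and $Et$, infer $A(t)$
$\exists I$: from $A(t)$ and $Et$, infer $\exists x A(x)$
$\exists E$: from $\exists x A(x)$ and a derivation of $C$ from the hypotheses $[A(x)]$ and $[Ex]$, infer $C$, discharging both hypotheses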





The axiomatization of the equality aspects is simple if the equivalence relation is
used instead of identity itself. Add the axioms:


$$x \simeq y \wedge A(x) \to A(y)$$


$$\forall z\,(x \simeq z \leftrightarrow y \simeq z) \to x \simeq y$$
That is, take $\simeq$ as a primitive, and $=$ is then regained by defining
$$x = y := Ex \wedge Ey \wedge x \simeq y$$


Scott (1970) also extended his model to a higher-order theory of reals, in which
he could show that Brouwer’s continuity theorem holds, while Moschovakis (1973)
applied the methods of Scott’s semantics to intuitionistic analysis in the style of
Kleene’s FIM. Her objects were continuous mappings from Baire space to Baire
space (a model with total elements). Van Dalen’s (1978) model for analysis is based
on Beth’s original semantics. It can be translated straightforwardly into a topological
model, but its formulation in a Beth model allows for some extra fine tuning so
that techniques from set theory can be applied. For example, a model for lawless
sequences is constructed by means of forcing. The model also interprets the theory
of the creating subject.

Sheaf models, as examples of a topos, allow a natural interpretation of higher-order
logic. Van der Hoeven and Moerdijk (1984) used a special sheaf model to interpret the
theory of lawless sequences. Moreover, since Kripke models and Beth models are built
over trees, they carry a natural topology, and so they too can be viewed as sheaf models.

The next generalization after sheaf models is that of categorical model. Certain
categories are powerful enough to interpret higher-order intuitionistic logic. In that
sense, topos models can be viewed as intuitionistic universes. Categorical models
also turned out to be important for typed lambda calculus. (The reader is referred to
the literature for details.) It should also be mentioned that the topos semantics has
managed to capture most of the known semantics. For example, Hyland (1982)
showed that the realizability interpretation fitted into his effective topos; see also
Troelstra and van Dalen (1988, ch. 13, section 8).


11.4. First-Order Logic


While there is no special foundational bias towards first-order theories, it is an
observed fact that large parts of mathematics lend themselves to formulation in a
first-order language. This section looks at a number of first-order theories and points
out some of their salient properties.

Intuitionistic predicate logic itself is, just like its classical counterpart, undecidable.
Fragments are, however, decidable. For example, the class of prenex formulas is
decidable, and, as a corollary, not every formula has a prenex normal form in IQC
(Kreisel, 1958). On the other hand, although monadic predicate logic is decidable
in classical logic, Kripke (1968) showed it is undecidable in intuitionistic logic; see
also Orevkov et al. (1965). Likewise, Lifschitz (1967) showed that the theory of
equality is undecidable.

In propositional logic, one may impose restrictions on the underlying trees of
Kripke models; in predicate logic, one may also put restrictions on the domains of
the models. A well-known instance is the constant domain theory of Goernemann
(1971) who showed that predicate logic plus the axiom

$$\forall x\,(A(x) \vee B) \to \forall x A(x) \vee B$$

(with $x$ not free in $B$) is complete for Kripke models with constant domains.

In general, Kripke semantics is a powerful tool for meta-mathematical purposes;
its usefulness has already been seen in subsection 11.3.2 in the case of the disjunction
property. The key method in the proof was the joining of a number of disjoint
Kripke models by placing one extra node (the root of the tree) below the models.
This technique has become known as gluing. Following Smorynski, it will be applied
to a few theories. (First, however, note that, although a number of intuitionistic
first-order theories share their axioms with their classical counterparts, in general,
theories are sensitive to the formulation of the axioms, notably in the absence of
PEM.)

The theory of equality has the usual axioms: reflexivity, symmetry, transitivity.
One can strengthen this theory in many ways, for example, the theory of stable
equality is given by

$$EQ^{st} = EQ + \forall xy\,(\neg\neg\, x = y \to x = y)$$

And the decidable theory of equality is axiomatized by

$$EQ^{dec} = EQ + \forall xy\,(x = y \vee x \neq y)$$

where $x \neq y$ abbreviates $\neg\, x = y$.
In intuitionistic mathematics, there is also a strong notion of inequality: apartness,
$\#$, as mentioned above. This was introduced by Brouwer (1919) and axiomatized by
Heyting (1925). The axioms of AP are given by EQ and the following list:


$$\forall x y x' y'\,(x \,\#\, y \wedge x = x' \wedge y = y' \to x' \,\#\, y')$$
$$\forall xy\,(x \,\#\, y \to y \,\#\, x)$$
$$\forall xy\,(\neg\, x \,\#\, y \leftrightarrow x = y)$$
$$\forall xyz\,(x \,\#\, y \to x \,\#\, z \vee y \,\#\, z)$$


The gluing technique will now be used to show that AP has the disjunction and
existence properties. Let $AP \vdash A \vee B$ and assume $AP \nvdash A$ and $AP \nvdash B$. Then, by the
strong completeness theorem, there are models $\mathcal{K}_1$ and $\mathcal{K}_2$ of AP such that $\mathcal{K}_1 \nVdash A$
and $\mathcal{K}_2 \nVdash B$. Consider the disjoint union of $\mathcal{K}_1$ and $\mathcal{K}_2$ and place the one-point
world $k_0$ below it. That is to say, designate a point in each of $\mathcal{K}_1$ and $\mathcal{K}_2$ to be
identified with the single point of $k_0$. The new model obviously satisfies the axioms of AP.
Hence $k_0 \Vdash A \vee B$ and so $k_0 \Vdash A$ or $k_0 \Vdash B$. Both are impossible on the grounds of
the choice of $\mathcal{K}_1$ and $\mathcal{K}_2$. Contradiction. Hence $AP \vdash A$ or $AP \vdash B$.

For EP, it is convenient to assume that the theory has a number of constants, say
$\{c_i \mid i \in I\}$. Now let $AP \vdash \exists x A(x)$ and $AP \nvdash A(c_i)$ for all $i$. Then, for each $i$, there is
a model $\mathcal{K}_i$ with $\mathcal{K}_i \nVdash A(c_i)$. As above, the models $\mathcal{K}_i$ can be glued by means of a
bottom world $k_0$ with a domain consisting of just the elements $c_i$. No non-trivial
atoms are forced in $k_0$ (i.e., only the trivial identities $c_i = c_i$). The identification of the
$c_i$ with elements in the models is obvious. Again, it is easy to check that the new
model satisfies AP. Hence $k_0 \Vdash \exists x A(x)$, i.e., $k_0 \Vdash A(c_i)$ for some $i$. But then also
$\mathcal{K}_i \Vdash A(c_i)$, contradiction. Hence $AP \vdash A(c_i)$ for some $i$.

The gluing operation thus demonstrates that there are interesting operations in
Kripke model theory that make no sense in traditional model theory.

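The apartness axioms have consequences for the equality relations. In particular
stable equality is obtained:

$$\neg\neg\, x = y \to x = y$$

For, $\neg\, x \,\#\, y \leftrightarrow x = y$, so
$$\neg\neg\, x = y \leftrightarrow \neg\neg\neg\, x \,\#\, y \leftrightarrow \neg\, x \,\#\, y \leftrightarrow x = y$$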


Indeed, the equality fragment itself is axiomatized by an infinite set of quasi-stability
axioms. Put


$$x \neq_0 y := \neg\, x = y$$
$$x \neq_{n+1} y := \forall z\,(x \neq_n z \vee z \neq_n y)$$

For these ‘approximations to apartness’, formulate quasi-stability axioms:


$$S_n := \forall xy\,(\neg\, x \neq_n y \to x = y)$$







The $S_n$ axiomatize the equality fragment of AP. To be precise: AP is conservative
over $EQ + \{S_n \mid n \geq 0\}$. This shows that even a relatively simple theory like equality
is incomparably richer than the classical theory.

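Apartness and linear order are closely connected. The theory LO of linear order
has axioms:


$$\forall xyz\,(x < y \wedge y < z \to x < z)$$
$$\forall xyz\,(x < y \to z < y \vee x < z)$$
$$\forall xy\,(x = y \leftrightarrow \neg\, x < y \wedge \neg\, y < x)$$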


The second axiom is worth noting, because it tells, so to speak, that $a < b$ means
that $a$ is ‘far’ to the left of $b$, in the sense that if an arbitrary third point is chosen,
it has to be to the right of $a$ or to the left of $b$. The relation with apartness is given
by


$$LO + AP \vdash x < y \vee y < x \leftrightarrow x \,\#\, y$$


One can also use $x < y \vee y < x$ to define apartness. In a way, this gives the best
possible result since LO + AP is conservative over LO (van Dalen, 1997; Smorynski,
1973b).

It is important to note that the atoms, or, in general, the quantifier free part of
a theory does not yet determine whether it is classical. There are cases where a
decidable equality results in the fact that the theory is classical, e.g., the theory of
algebraically closed fields; on the other hand, for arithmetic one can prove that


$$\forall xy\,(x = y \vee x \neq y)$$
but the theory is not classical.

The best known first-order theory is, of course, arithmetic. Intuitionistic arithmetic,
HA, named after Heyting (1930a, b), is axiomatized by exactly the same axioms
as Peano’s arithmetic. The difference is the underlying logic:

$$PA = HA + PEM$$

The gluing technique also works for HA although, in this case, one has to do
some extra work to check the induction axiom. The result is that for HA one has
the existence property for numerals

$$HA \vdash \exists x A(x) \Rightarrow HA \vdash A(\bar{n}) \text{ for some } n$$


And, while the disjunction property is an obvious consequence of the existence
property, as $\vee$ is definable in terms of $\exists$:


$$HA \vdash (A \vee B) \leftrightarrow \exists x\,((x = 0 \to A) \wedge (x \neq 0 \to B))$$

it would seem rather unlikely that the existence property is a consequence of the
disjunction property. Yet, to the surprise of the insiders, Friedman (1975) proved
that this is indeed the case for HA and a number of related systems.

Intuitionistic arithmetic is, of course, even more incomplete than classical arithmetic
because it is a subsystem of PA [see chapter 4]. In fact PA is an unbounded
extension of HA.

There are a number of interesting extensions of intuitionistic arithmetic formed
by adding principles that have a certain constructive motivation. One such is Markov's
Principle

(MP) $\forall x\,(A(x) \vee \neg A(x)) \wedge \neg\neg\exists x A(x) \to \exists x A(x)$


This principle is a generalization of the original formulation of Markov (1971 [1962]),
who considered the halting of a Turing machine. Such a machine is an abstract
computing device that operates on a potentially infinite tape. The key question for
Turing machines is: Does the machine, when presented with an input on the tape,
eventually halt (and hence produce an output)? Suppose now that someone says that
it is impossible that the machine never halts; does one know that it indeed halts?
Markov argued: ‘Yes.’ The decision procedure for the halting in this case consists of
turning on the machine and waiting for it to halt. An intuitionist would not buy the
argument, however. When somebody claims that a Turing machine will stop, the
natural question is: ‘When?’ One wants an actual bound on the computation time.
Reading ‘the machine halts at time $x$’ for $A(x)$, the above formulation exactly covers
Markov’s argument. In fact, in Markov’s case $A(x)$ is primitive recursive.

A simple Kripke model shows that $HA \nvdash MP$. Consider a model with two nodes
$k_0$, $k_1$ where $k_0 < k_1$. In the bottom node, put the standard model of natural
numbers; in the top node, put a (classical) non-standard model in which the
negation of Gödel’s sentence


I am not provable in PA,


i.e., the proper sentence of the form $\exists x A(x)$, is true. So $k_1 \Vdash \exists x A(x)$ and hence
$k_0 \Vdash \neg\neg\exists x A(x)$. But $k_0 \Vdash \exists x A(x)$ would ask for an instance $A(\bar{n})$ to be true in the
standard model, and hence would yield a conflict with the independence of the Gödel
sentence from PA. Since, clearly, $k_0 \Vdash \forall x\,(A(x) \vee \neg A(x))$, the model refutes MP.

Although Markov’s principle may be unprovable in HA, its companion, Markov’s
rule, has a stronger position. Given a statement of the form $A \to B$, one may
formulate a corresponding rule


$$\Gamma \vdash A \Rightarrow \Gamma \vdash B$$
which is, in general, weaker than
$$\Gamma \vdash A \to B$$
For example: Markov's rule MR with respect to HA says:
$$HA \vdash \forall x\,(A(x) \vee \neg A(x)) \wedge \neg\neg\exists x A(x) \Rightarrow HA \vdash \exists x A(x)$$
HA is said to be closed under Markov’s rule. The heuristic argument is that there is
more information in this case; there is a proof of $\neg\neg\exists x A(x)$, and from this extra evidence
one may hope to draw a stronger conclusion. Markov's rule is indeed correct; see,
for example, Troelstra and van Dalen (1988, p. 129, and p. 507). Note that $A(x)$
may contain more free variables. For closed $\exists x A(x)$, the proof of closure under
Markov’s rule is particularly simple: if $HA \vdash \neg\neg\exists x A(x)$, then $PA \vdash \exists x A(x)$, and
hence $A(\bar{n})$ is true in the standard model for some $n$. Now $HA \vdash A(\bar{n}) \vee \neg A(\bar{n})$
and, by DP, $HA \vdash A(\bar{n})$ or $HA \vdash \neg A(\bar{n})$. The latter is impossible, hence $HA \vdash A(\bar{n})$,
and a fortiori $HA \vdash \exists x A(x)$ (Smorynski, 1973a, p. 366).

Since the theory of natural numbers is at the very heart of mathematics, it is no
surprise that a great deal of research has been devoted to the subject. In the early
days of intuitionism, people wondered to what extent HA was safer than PA. This
was settled in 1933 by Gödel (and independently by Gentzen), who showed that
PA can be translated into HA. Gödel defined a translation, which from the intuitionistic
viewpoint weakened statements. This was basically done by a judicious distribution
of negations. Here is the formal definition:


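$$A^\circ = \neg\neg A, \text{ for atomic } A$$
$$(A \wedge B)^\circ = A^\circ \wedge B^\circ$$
$$(A \vee B)^\circ = \neg(\neg A^\circ \wedge \neg B^\circ)$$
$$(A \to B)^\circ = A^\circ \to B^\circ$$
$$(\forall x A(x))^\circ = \forall x A^\circ(x)$$
$$(\exists x A(x))^\circ = \neg\forall x \neg A^\circ(x)$$
This Gödel translation gives
$$PA \vdash A \iff HA \vdash A^\circ$$
and hence


$$PA \vdash 0 = 1 \iff HA \vdash \neg\neg\, 0 = 1 \iff HA \vdash 0 = 1$$


So PA is consistent iff HA is so. In other words, no deep philosophical insight can
be expected here.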
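
Since the translation is defined by recursion on formulas, it is straightforward to mechanize. The following is a minimal sketch under stated assumptions, not a definitive implementation: formulas are represented as nested tuples, $\neg A$ is encoded as $A \to \bot$, and $\bot$ is left untouched by the translation (an assumption made here; the clauses above only mention atoms). The names `neg` and `godel` are illustrative.

```python
BOT = ('bot',)


def neg(a):
    """Encode negation as implication of falsum (an encoding chosen here)."""
    return ('imp', a, BOT)


def godel(f):
    """Negative translation of a formula, clause by clause.

    Formulas are tuples: ('atom', name), ('bot',), ('and', a, b),
    ('or', a, b), ('imp', a, b), ('all', var, a), ('ex', var, a).
    """
    tag = f[0]
    if tag == 'bot':
        return BOT                      # assumption: falsum is left unchanged
    if tag == 'atom':
        return neg(neg(f))
    if tag == 'and':
        return ('and', godel(f[1]), godel(f[2]))
    if tag == 'or':
        return neg(('and', neg(godel(f[1])), neg(godel(f[2]))))
    if tag == 'imp':
        return ('imp', godel(f[1]), godel(f[2]))
    if tag == 'all':
        return ('all', f[1], godel(f[2]))
    if tag == 'ex':
        return neg(('all', f[1], neg(godel(f[2]))))
    raise ValueError(f"unknown connective: {tag}")


P = ('atom', 'P')
Q = ('atom', 'Q')
LEM = ('or', P, neg(P))

print(godel(P))
print(godel(('ex', 'x', P)))
print(godel(LEM))
```

The translated formula contains neither $\vee$ nor $\exists$, which is the syntactic feature that makes the translated statements intuitionistically weaker.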

It is an easy consequence of the Gödel translation theorem that the universal
fragment of PA is conservative over HA. Of course, the Gödel translation also
works for predicate logic:


$$\Gamma \vdash_c A \iff \Gamma^\circ \vdash_i A^\circ$$


where $\Gamma^\circ = \{B^\circ \mid B \in \Gamma\}$ (van Dalen, 1997, p. 164).







Dirk van Dalen 





The result of the Gödel translation may be improved for certain simple formulas.
Kreisel showed that PA and HA prove the same $\Pi^0_2$ sentences,


$$PA \vdash \forall x \exists y A(x, y) \iff HA \vdash \forall x \exists y A(x, y)$$


where $A(x, y)$ is quantifier free.

The proof of this has some quite interesting features. First, note that a quantifier
free formula $A(x, y)$ is equivalent in HA to an equation $t(x, y) = 0$ (where it is
assumed that HA has defining equations for the primitive recursive functions).
Next, introduce the Friedman translation: for a given formula $F$, obtain $A^F$ from $A$
by replacing all atoms $P$ by $P \vee F$. It is a routine exercise to show that


(i) $\Gamma \vdash_i A \Rightarrow \Gamma^F \vdash_i A^F$
(ii) $F \vdash_i A^F$

(iii) $HA \vdash A \Rightarrow HA \vdash A^F$
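
The Friedman translation is a purely syntactic substitution and can be sketched in a few lines. The code below is illustrative only: it assumes formulas as nested tuples, treats $\bot$ as an atom (so that $\bot^F = \bot \vee F$), and assumes that the bound variables of $A$ do not capture free variables of $F$. The names `friedman`, `E` and `NOT_NOT_E` are hypothetical.

```python
BOT = ('bot',)


def friedman(f, F):
    """Replace every atom P of f (falsum included) by P or F.

    Formulas are tuples: ('atom', name), ('bot',), ('and', a, b),
    ('or', a, b), ('imp', a, b), ('all', var, a), ('ex', var, a).
    Assumption: no bound variable of f captures a free variable of F.
    """
    tag = f[0]
    if tag in ('atom', 'bot'):
        return ('or', f, F)
    if tag in ('and', 'or', 'imp'):
        return (tag, friedman(f[1], F), friedman(f[2], F))
    if tag in ('all', 'ex'):
        return (tag, f[1], friedman(f[2], F))
    raise ValueError(f"unknown connective: {tag}")


# A double-negated existential statement, translated with respect to the
# existential statement itself.
EQ = ('atom', 't(x,y)=0')
E = ('ex', 'x', EQ)
NOT_NOT_E = ('imp', ('imp', E, BOT), BOT)

print(friedman(NOT_NOT_E, E))
```

Translating a double negation $\neg\neg E$ with respect to $E$ itself yields a formula of the shape $((E' \to \bot \vee E) \to \bot \vee E)$, where $E'$ is $E$ with its atom weakened by $E$; this is the pattern exploited in the argument for closure under Markov's rule.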
Now consider a term $t$ and let


$$HA \vdash \neg\neg\exists x\,(t(x, y) = 0)$$

We apply the Friedman translation with respect to $F = \exists x\,(t(x, y) = 0)$:


$$((\exists x\,(t(x, y) = 0) \to \bot) \to \bot)^F = ((\exists x\,(t(x, y) = 0 \vee \exists x\,(t(x, y) = 0)) \to (\bot \vee \exists x\,(t(x, y) = 0))) \to (\bot \vee \exists x\,(t(x, y) = 0)))$$


This formula is equivalent to $\exists x\,(t(x, y) = 0)$. Hence $HA \vdash \exists x\,(t(x, y) = 0)$.
Observe that one now has closure under Markov's rule with numerical parameters.

It is just one step further to reach Kreisel’s theorem. From (formalized) recursion
theory, it is known that a function $f$ is provably recursive in HA with index $e$ if


$$HA \vdash \forall x \exists y\, T(e, x, y)$$


where $T$ is Kleene's T-predicate, which formalizes the notion ‘$y$ is a (halting) computation
on input $x$’ (strictly speaking ‘$y$ is the code of a computation ...’). Now

$PA \vdash \forall x \exists y\, T(e, x, y) \iff PA \vdash \exists y\, T(e, x, y)$
$\iff$ (Gödel translation) $HA \vdash \neg\neg\exists y\, T(e, x, y)$
$\iff$ (closure under Markov’s Rule) $HA \vdash \exists y\, T(e, x, y)$
$\iff HA \vdash \forall x \exists y\, T(e, x, y)$

(Friedman (1977) showed that this fact also holds in intuitionistic set theory, IZF.)
Finally, it is worth mentioning that Gödel also introduced a translation of
intuitionistic logic into modal logic; in this translation, the necessity operator has the
flavour of ‘is provable.’ (Recently Artemov (2001) made the provability explicit in a
refined version.)
11.4.1. Set theories


Following the tradition in set theory, the various intuitionistic modifications of ZF
are also first-order. For most classical theories, one can consider one or more
corresponding intuitionistic theories. In some cases, it suffices to omit PEM from the
logical axioms, although one has to be careful when some non-logical axioms are by
themselves strong enough to imply PEM.

Here is an example: the full axiom of choice implies PEM. Let A be a statement,
and define


P = {n ∈ N | n = 0 ∨ (n = 1 ∧ A)}
Q = {n ∈ N | n = 1 ∨ (n = 0 ∧ A)}
S = {P, Q}


Since obviously

∀X ∈ S ∃x ∈ N (x ∈ X)

AC would yield a choice function F such that

∀X ∈ S (F(X) ∈ X)

Observe that F(P), F(Q) ∈ N, so

F(P) = F(Q) ∨ F(P) ≠ F(Q)


If F(P) = F(Q), then A holds. If, on the other hand, F(P) ≠ F(Q), then ¬A. So,
A ∨ ¬A.¹¹

Friedman studied IZF, an intuitionistic version of ZF; for a formulation, see Beeson
(1985, ch. 8, section 1). The axioms for sets are modifications of the traditional ZF
ones [see chapter 3], but the theory is very sensitive to the formulation: a wrong choice
of axiom may introduce unwanted logical principles. For example, ∈-induction is
used rather than Foundation, and similarly the Axiom of Collection takes the place
of the Replacement axiom. As noted above, Friedman (1973) showed that the Gödel
translation theorem works for a suitable formulation of intuitionistic set theory.

Aczel considered another version of constructive set theory, CZF. This set theory
has the attractive feature that it is interpretable in a particular type theory of Martin-
Löf (Troelstra and van Dalen, 1988, p. 624).

Set theory under the assumption of Church's Thesis was extensively studied by
McCarty (1984, 1986, 1991). He built a model of cumulative constructive set
theory in which a number of interesting phenomena can be observed; in
Kleene's realizability universe, sets with apartness are subcountable (i.e., the range
of a function on N).


11.5. Second-Order Logics


Whereas first-order intuitionistic logic is a subsystem of classical logic, higher-order
logic may contain rules or axioms that are constructively justified, but which contradict
classical logic. In practice, there are two ways to formulate second-order logic: with
set variables and with function variables.


11.5.1. IQC²


Second-order logic with set variables is a straightforward adaptation of the classical
formulation (see chapter 2); see van Dalen (1997) and Troelstra and Schwichtenberg
(1996, ch. 11). It is a surprising fact that in IQC² the connectives are not independent
as in first-order logic. Prawitz (1965) showed that one can define the connectives in
∀¹, ∀² and →, where X is a 0-ary predicate:


⊥ ↔ ∀X.X
A ∧ B ↔ ∀X((A → (B → X)) → X)
A ∨ B ↔ ∀X((A → X) ∧ (B → X) → X)
∃¹xA ↔ ∀X(∀¹x(A → X) → X)
∃²YA ↔ ∀X(∀²Y(A → X) → X)
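
These equivalences can be checked mechanically. The following is a sketch in Lean 4 (core library only), on the assumption that Lean's impredicative Prop may stand in for quantification over a 0-ary predicate X; Lean's type theory is not IQC², so this illustrates the definitions rather than formalizing Prawitz's result. The theorem names are chosen here for illustration, and no classical axioms are used.

```lean
-- ⊥ ↔ ∀X.X
theorem bot_def : False ↔ ∀ X : Prop, X :=
  ⟨fun h _ => h.elim, fun h => h False⟩

-- A ∧ B ↔ ∀X((A → (B → X)) → X)
theorem and_def (A B : Prop) :
    (A ∧ B) ↔ ∀ X : Prop, (A → (B → X)) → X :=
  ⟨fun h _ k => k h.1 h.2, fun h => h (A ∧ B) (fun a b => ⟨a, b⟩)⟩

-- A ∨ B ↔ ∀X((A → X) ∧ (B → X) → X)
theorem or_def (A B : Prop) :
    (A ∨ B) ↔ ∀ X : Prop, ((A → X) ∧ (B → X)) → X :=
  ⟨fun h _ k => h.elim k.1 k.2, fun h => h (A ∨ B) ⟨Or.inl, Or.inr⟩⟩

-- ∃xA ↔ ∀X(∀x(A → X) → X)
theorem exists_def {α : Type} (P : α → Prop) :
    (∃ x, P x) ↔ ∀ X : Prop, (∀ x, P x → X) → X :=
  ⟨fun h _ k => h.elim k, fun h => h (∃ x, P x) (fun x hx => ⟨x, hx⟩)⟩
```

In each case the right-to-left direction instantiates X with the very formula being defined, which is where impredicativity enters.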


Classically, sets and functions are interdefinable as each set has a characteristic function,
but, intuitionistically, S has a characteristic function only if its membership is decidable:


a ∈ S ↔ k(a) = 1
¬a ∈ S ↔ k(a) = 0


Since

k(a) = 1 ∨ k(a) = 0

one has a ∈ S ∨ ¬a ∈ S. The moral is that there are lots of sets without characteristic
functions. Note that the ‘set’-approach and the ‘function’-approach to second-order
logic (arithmetic) yield diverging theories. Generally speaking, total functions, with
their ‘input-output’ behavior, are more tractable than sets. That is a good reason to
study second-order arithmetic with function variables. Another reason is that this
formulation is a natural framework for treating Brouwer's choice sequences.¹²


11.5.2. The theory of choice sequences


For the practice of intuitionistic mathematics, second-order arithmetic with function
variables is even more significant than the version with set variables. The reason is
that this theory allows one to capture the properties of Brouwer's choice sequences.
Since this survey is about logic, the topic is not quite within its scope. This section
therefore just briefly notes the main points; for more information, the reader is
referred to the literature.

Brouwer's chief contribution to this part of intuitionistic logic is that he realized
that particular quantifier combinations are given a specific reading. Choice sequences
of, say, natural numbers are infinite sequences, α, of natural numbers, chosen more
or less arbitrarily. That is to say, in general, there is no law that determines future
choices. Suppose now that it has been shown that ∀α∃nA(α, n) for some formula
A; this means that when a choice sequence α is generated, one will eventually be
able to compute the number n such that A(α, n) holds. Roughly speaking, this says
that, at some stage in the generation of α, all the information needed for the
computation of n is available, but then no further information is needed, and any β
that coincides so far with α will yield the same n. This sketch is necessarily somewhat
simplified; for an extensive analysis see van Atten and van Dalen (2000).

This, in a nutshell, is Brouwer's Continuity Principle:


∀α∃xA(α, x) → ∀α∃x∃y∀β(ᾱ(y) = β̄(y) → A(β, x))


Here γ̄(y) stands for the initial segment of length y of a sequence γ. On the basis
of this principle, formulated for ∀α∃!x in Brouwer (1918), already a number of
intuitionistic facts, conflicting with classical mathematics, can be derived. It allows,
for example, a simple rejection of PEM.
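
The idea that a value computed from α depends only on an initial segment of α can be made concrete with a small Python sketch. It is an analogy under stated assumptions, not a model of choice sequences: sequences are represented by ordinary Python functions, and run_with_modulus, phi, alpha and beta are names invented for the example.

```python
def run_with_modulus(phi, alpha):
    """Evaluate phi on alpha and report how long an initial segment it read."""
    highest = -1

    def probe(i):
        # Record the largest index that phi asks for, then answer as alpha does.
        nonlocal highest
        highest = max(highest, i)
        return alpha(i)

    value = phi(probe)
    return value, highest + 1


def phi(alpha):
    # A total functional: it reads alpha(0) and one further entry.
    return alpha(0) + alpha(alpha(0) % 4)


def alpha(i):
    return 3 - i if i < 3 else 0      # the sequence 3, 2, 1, 0, 0, ...


value, y = run_with_modulus(phi, alpha)


def beta(i):
    return alpha(i) if i < y else 7   # agrees with alpha below y, differs after


same = phi(beta) == value
```

Here y plays the role of the length of the initial segment in the principle: beta differs from alpha everywhere from index y on, yet phi returns the same value on both.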

In later papers, Brouwer further investigated the continuity phenomenon. He
added a powerful induction principle, which enabled him to show his famous Con-
tinuity Theorem: all total real functions on the continuum are locally uniformly
continuous (Brouwer, 1924b). As a corollary, the continuum cannot be decom-
posed into two inhabited parts. Another well-known consequence is the Fan theorem
(basically the compactness of the Cantor space).

In the 1920s, Brouwer introduced an extra strengthening of analysis, the details
of which were published only after 1946. The new idea, known by the name ‘the
creating subject’, was formulated by Kreisel in terms of a tensed modal operator.
Subsequently, Kripke simplified the presentation by introducing a choice sequence α
that ‘witnesses’ a particular statement A. The sequence α keeps track of the success of the subject
in establishing A; it produces zeros as long as the subject has not established A, and
when A is proved, or experienced, α produces a single ‘one’ and goes on with zeros.
The existence of such an α is the content of Kripke's schema:


KS ∃α(A ↔ ∃x α(x) = 1)


Brouwer used the creating subject (and implicitly Kripke's schema) to establish
strong refutations, which go beyond the already existing Brouwerian counterexamples.
He showed (Brouwer, 1949), for example, that

¬∀x ∈ R(¬¬x > 0 → x > 0)

and

¬∀x ∈ R(x ≠ 0 → x # 0)

See also Hull (1969).


Kripke's schema has significant consequences for the nature of the mathematical
universe. It conflicts, for example, with ∀α∃β-continuity,


∀α∃βA(α, β) → ∃F∀αA(α, F(α))


where F is a continuous operation (Myhill, 1966). Using KS, van Dalen (1999b)
showed that Brouwer's indecomposability theorem for R can be extended to all
dense negative subsets of R (A ⊆ R is negative if ∀x(x ∈ A ↔ ¬¬x ∈ A)). So, for
example, the set of not-not-rationals is indecomposable.

The technique of the creating subject is used here to demonstrate that, under
the assumption of Kripke's schema, one can show a converse of Brouwer's indecom-
posability theorem: KS + R is indecomposable ⇒ there are no discontinuous real
functions.


Proof Let f be discontinuous in 0, and f(0) = 0. It follows that

∃k∀n∃x(|x| < 2⁻ⁿ ∧ |f(x)| > 2⁻ᵏ)


Hence there are xₙ with |f(xₙ)| > 2⁻ᵏ and |xₙ| < 2⁻ⁿ. Let α be the Kripke sequence
for r ∈ Q, and β for r ∉ Q. Put


γ(2n) = α(n)
γ(2n + 1) = β(n)
cₙ = xₙ if ∀p ≤ n γ(p) = 0
cₙ = xₚ if p ≤ n and γ(p) = 1
c = lim(cₙ).


Now, f(c) < 2⁻ᵏ ∨ f(c) > 0. If f(c) < 2⁻ᵏ then f(c) = 0, so ∀p(γ(p) = 0). Contradic-
tion. If f(c) > 0, then r ∈ Q ∨ r ∉ Q. Hence ∀r(r ∈ Q ∨ r ∉ Q). This contradicts
the indecomposability of R. QED

The theory of choice sequences has received a great deal of attention since Kleene
and Kreisel formulated suitable formalizations. Some notions, such as ‘lawless
sequence,’ have found important applications in metamathematics; see, for example,
Brouwer (1981, 1992) or van Dalen (1986) or Troelstra and van Dalen (1988)
and the references cited there. There is also an extensive literature on the semantics
of second-order arithmetic; see, for example, the references cited in subsection
11.3.3.


Suggested further reading


For a more comprehensive treatment of intuitionistic logic, the reader may want to consult
the author (van Dalen, 1986, 1997) or Kleene (1952). For the more mathematical and meta-
mathematical aspects, see also Troelstra and van Dalen (1988), and Heyting (1934, 1956),
Dummett (1977), Bridges and Richman (1987), as well as Philosophia Mathematica, vol. 6,
Special Issue: Perspectives on Intuitionism (ed. R. Tieszen). In addition to these works, for
proof theory, the reader might consult Beeson (1985), Buss (1998), Girard (1989), Martin-
Löf (1984), or Troelstra and Schwichtenberg (1996). Similarly, for more on semantics, see,
for example, Dragalin (1988), Fitting (1969), Fourman and Scott (1979), Gabbay (1982),
MacLane and Moerdijk (1992), Rasiowa and Sikorski (1963), and Smorynski (1973a). The
works of Dummett, Gabbay, Kleene, Smorynski, and Troelstra and van Dalen just cited are
also useful for material on first-order theories. Finally, second-order logic and choice se-
quences are further treated in Brouwer (1981, 1992), Kleene and Vesley (1968), and Troelstra
(1977), as well as works already mentioned, e.g., van Dalen (1986), Dummett (1977) and
Troelstra and van Dalen (1988).





Notes 


1 Even though Brouwer did not develop logic for its own sake, he was the first to establish
a non-trivial result: ¬A ↔ ¬¬¬A (Brouwer, 1920).

2 For a survey of the Brouwerian global philosophy of man and his mathematical enter-
prise, see van Dalen (1999a).

3 This kind of interpretation of the connectives in terms of proof was made explicit by
Heyting (1934). Kolmogorov (1932) had given a similar interpretation in terms of
problems and solutions. The formulations are, up to terminology, virtually identical.

4 Brouwer (1924a) considered various sequences in this expansion. For one of them, the
occurrence has been established: 01234567890 does indeed occur among the decimals
of π (Borwein, 1998). Nevertheless, in spite of considerable computational power, there
are still enough open questions concerning the occurrence of specific sequences of
decimals.

5 Heyting's formalization was published in 1930. Glivenko (1929) and Kolmogorov (1925)
had already published similar formalizations, which, however, did not cover full intui-
tionistic logic (Troelstra, 1978).

6 Normalization and cut-elimination for second-order logic is far more complicated because
of the presence of the comprehension rule, which introduces an impredicativity. The
proof of a normalization theorem was a spectacular breakthrough; the names to mention
here are Girard, Prawitz and Martin-Löf; see their papers in Fenstad (1971).

7 The theory of topological interpretations is treated extensively in Rasiowa and Sikorski
(1963).

8 The pioneer in this area was Lawvere, who saw the possibilities for treating logical
notions in a categorical setting; see, for example, Lawvere (1971). He, in particular,
discovered the significance of adjointness for logic.

9 Details of the logic can be found in Troelstra and van Dalen (1988) and Scott's original
(1979).

10 This result was first established by proof theoretic means in van Dalen and Statman
(1978), and subsequently an elegant model theoretic proof was given in Smorynski
(1973b).

11 This fact was discovered by Diaconescu (1975); the proof given here is Goodman and
Myhill's (1978).

12 For the proof theory of IQC², see Prawitz (1970), Troelstra and Schwichtenberg (1996),
and Troelstra (1973).


Chapter 12


Free Logics 
Karel Lambert 


12.1. Preliminary Remarks 


The expression ‘free logic,’ coined by the author in 1960, is an abbreviation for
‘logic free of existence assumptions with respect to its terms, singular and general,
but whose quantifiers are treated exactly as in standard quantifier logic.’ In more
traditional language, such logics do not presume that either singular or general
terms (the two distinct categories of terms emphasized in modern logical grammar)
have existential import. A singular term ‘t’ has existential import just in case t
exists (or, equivalently, there exists an object the same as t) and a general term (or
predicate) ‘G’ has existential import just in case Gs exist (or, equivalently, there exists
an object that is G).¹ Examples from colloquial English customarily taken to be
singular terms are expressions such as ‘Socrates’, ‘the planet causing perturbations in
the orbit of Mercury’, ‘5’, ‘5/0’, ‘the square of 3’ and ‘having a heart’. Some of
these do not have existential import, in particular, ‘5/0’ and ‘the planet causing
perturbations in the orbit of Mercury’. Examples from colloquial English custom-
arily taken to be general terms are expressions such as ‘is a philosopher’, ‘is a planet
causing perturbations in the orbit of Mercury’, ‘number’, ‘is divisible by 0’, and ‘has
a heart’. Some of these general terms do not have existential import, in particular,
‘is a planet causing perturbations in the orbit of Mercury’ and ‘is divisible by 0’. To
say that the quantifiers are treated exactly as in standard quantifier logic is to say,
roughly, that the operator symbol ‘∃’ (the existential quantifier) reads: ‘There exists
an object’, and the operator symbol ‘∀’ (the universal quantifier) reads: ‘Every
existent object’.

A distinctive property of free logics is rejection of the principle of standard first- 
order quantificational logic called universal specification (or its inferential counter- 
part, the rule of universal instantiation) [see chapter 1]. An instance of universal 
specification is 











∀x(P(x)) ⊃ P(t) (12.1)

where ‘P’ is a general term (or, in this case, a one-place predicate), ‘t’ is a singular
term, and ‘⊃’ is the sign for ‘if ... then’.

For instance, (12.1) might be the statement

If every existent object (x is such that x) perishes, then Caesar perishes. (12.1*)

Similarly, an instance of the rule of universal instantiation is

∀x(P(x)); so P(t) (12.2)

For example, (12.2) might be the inference

Every existent object (x is such that x) perishes; so Caesar perishes. (12.2*)

Typically, instead of these, free logics adopt a restricted principle of universal specifica-
tion (or its inferential counterpart, restricted universal instantiation). For instance, in
place of (12.1), one customarily finds

(Res 1) ∀x(P(x)) ⊃ (t exists ⊃ P(t))

where an instance of (Res 1) would be

(Res 1*) If every existent object (x is such that x) perishes, then Caesar
perishes provided Caesar exists.

Similarly, in place of (12.2), one typically finds

(Res 2) ∀x(P(x)); so (t exists ⊃ P(t))

For example,

(Res 2*) Every existent object (x is such that x) perishes; so Caesar perishes
provided Caesar exists.


As suggested in the first paragraph above, if identity is in the language, ‘t exists’ can
be eliminated in favor of ‘∃x(x = t)’. In particular, ‘Caesar exists’ can be taken as
shorthand for ‘There exists an object x (such that x) is the same as Caesar’.

Some, but not all, free logics are universally free. In universally free logics, state-
ments such as

∀x(P(x)) ⊃ ∃x(P(x)) (12.3)


are not regarded as logically true because they require the assumption that there
exists at least one object, or, in more conventional parlance, they require the assump-
tion that the domain (or universe) of discourse (construed as the set of existent
objects) is nonempty. But it is important to notice that rejection of the logical truth
of a statement like (12.1*), and acceptance of the logical truth of a statement like
(Res 1*), concern their constituent singular terms; it is presumed only that their
constituent singular terms have existential import. They do not, as does acceptance
of (12.3) as a logical truth, require the domain (or universe) of discourse to be
nonempty. Indeed it is common to see statements such as (12.3) adopted as a
logical truth in many systems of free logic. Such free logics, thus, are not universally
free.
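
The dependence of (12.3) on a nonempty domain can be checked directly on finite domains. The following Python sketch is illustrative only; the helper names forall, exists, implies and statement_12_3 are invented here, and a domain is simply a finite set of existent objects.

```python
def forall(domain, pred):
    """'Every existent object x is such that pred(x)'."""
    return all(pred(d) for d in domain)


def exists(domain, pred):
    """'There exists an object x such that pred(x)'."""
    return any(pred(d) for d in domain)


def implies(p, q):
    return (not p) or q


def statement_12_3(domain, pred):
    """Truth value of (12.3) for a given domain and reading of P."""
    return implies(forall(domain, pred), exists(domain, pred))


empty_case = statement_12_3(set(), lambda d: True)
nonempty_case = statement_12_3({"Caesar"}, lambda d: True)
```

On the empty domain the universal antecedent is vacuously true while the existential consequent is false, so (12.3) fails there; on any nonempty domain it holds whatever P is taken to be.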

Free logics are compatible with a pair of conflicting world pictures (essentially,
interpreted model structures). One kind, the actualist world picture, depicts the
inhabitants of a world (members of the domain, or universe of discourse, of a model
structure) to be of only one kind, the existent objects. An exponent of this kind of
world picture is the British philosopher Bertrand Russell. The second kind, the
nonactualist world picture, depicts the inhabitants to be of two kinds, those that
exist and those that do not exist. An exponent of this kind of world picture is the
Austrian philosopher Alexius Meinong. In the first kind of world picture, to say that
a singular term has existential import amounts to saying that its purported referent
is an inhabitant, but this is not so in the second kind of world picture. For singular
terms, not having existential import is not necessarily to be equated with not having
a referent or not denoting.

Free logics can be divided into three classes depending on how atomic statements
containing singular terms not having existential import are evaluated for truth-value.
Negative free logics are those free logics in which all atomic statements containing at
least one singular term not having existential import are evaluated false. This species
of free logic has become a very common logical foundation for certain kinds of
programming languages. Positive free logics are free logics in which some atomic
statements containing singular terms not having existential import are evaluated
true. This species of free logic also serves as a logical foundation for some pro-
gramming languages, LISP, for example. Neutral free logics are free logics in which
atomic statements, except perhaps those of the form ‘t exists’, containing at least
one singular term not having existential import, are evaluated as truth-valueless.
Those who do not accept a Meinongian world picture consider this species of free
logic more congenial to normal inference obeying the Fregean principle that the
value of a complex expression is a function of the values of its parts; see, for
example, Lehmann (1994).
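
The three-way classification can be summarized as three policies for evaluating an atomic statement. The Python sketch below is a simplification under stated assumptions: an atomic statement is a pair of a predicate name and a tuple of singular terms, facts lists the true atomic statements about existents, and outer_facts is a stipulated list of those term-involving statements a positive free logic counts as true. All of these names are invented for the illustration, and the special treatment of ‘t exists’ in neutral free logics is not modelled.

```python
def evaluate(atom, facts, existents, policy, outer_facts=frozenset()):
    """Return True, False, or None (truth-valueless) for an atomic statement."""
    _, terms = atom
    if all(t in existents for t in terms):
        # Every singular term has existential import: ordinary evaluation.
        return atom in facts
    if policy == "negative":
        return False                 # all such atomic statements are false
    if policy == "positive":
        return atom in outer_facts   # some such atomic statements are true
    if policy == "neutral":
        return None                  # such atomic statements lack a truth-value
    raise ValueError(f"unknown policy: {policy}")


existents = {"Mercury"}
facts = {("is_planet", ("Mercury",))}
vulcan_self_identity = ("same_as", ("Vulcan", "Vulcan"))

negative_value = evaluate(vulcan_self_identity, facts, existents, "negative")
positive_value = evaluate(vulcan_self_identity, facts, existents, "positive",
                          outer_facts={vulcan_self_identity})
neutral_value = evaluate(vulcan_self_identity, facts, existents, "neutral")
```

The same statement, ‘Vulcan is the same as Vulcan’, thus comes out false, true, or truth-valueless according to the policy chosen, while statements about existents are evaluated alike under all three.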

How the truth-values of atomic sentences containing singular terms without exis-
tential import are to be evaluated affects quite dramatically one's account of logical
truth and valid inference. For instance, consider Descartes's Cogito ergo sum. In
negative free logics, the conditional ‘If I think, then I am’ is logically true and the
inference from ‘I think’ to ‘I am’ is valid, but in positive free logics (based on a
nonactualist world picture), the conditional in question is not logically true and
the inference is invalid. In the negative free logic case, any substitution instance for
the singular term ‘I’ that does not have existential import, in ‘I think’, will make the
premise false, and hence the inference valid. But, in the case of positive free logics,
substitution, for example, of the grammatically proper name ‘Vulcan’, a singular
term not having existential import, for ‘I’ in both premise and conclusion of the
inference in question and ‘is the same as’ for ‘think’ in the premise, yields a true
premise, but a false conclusion. On the other hand, there are positive free logics
based on actualist world pictures in which the inference from ‘I think’ to ‘I am’ is
valid, but the conditional ‘If I think, then I am’ is not logically true (Bencivenga et
al., 1991, pp. 115-16).

Free logics do not presume that those singular terms falling in the class of gram-
matically proper names are reducible to definite descriptions. They do not presume
that ‘exists’ in singular statements such as ‘Caesar exists’ is a predicate, for there may
be free logics in which the expression ‘exists’ does not even occur in the language
(Lambert, 1963a). They do not presume that the interpretation of the quantifiers
is objectual, i.e., that the truth of quantificational statements is given by clauses
appealing to the values of the variables; such clauses might instead appeal to a
subclass of the substituends of the variables, namely, those that have existential
import. Finally, to re-emphasize a point made above, free logics do not presume
there is any intimate connection between a singular term ‘t’ having existential
import and ‘t’ referring.

Despite anticipations in the first half of the twentieth century, e.g., by Rosser
(1939), concentrated technical and philosophical study of free logics dates only from
the mid-1950s.² Their genesis and leading principles may be explained as follows.


12.2. Genesis and Leading Principles 


Let ‘S’ and ‘P’ be place holders for general terms, expressions purporting to be true
(or false) of each of possibly many objects. The Port Royal theory of immediate
inference, a theory that flourished in the vicinity of 1662 and which derived ulti-
mately from Aristotle, counted the inferences of the following forms valid:


(A) All S are P
∴ (I) Some S are P


(E) No S are P
∴ (O) Some S are not P


Moreover, that theory classified as valid inferences from an (A) statement to the
negation of an (O) statement, and vice versa, and from an (E) statement to the
negation of an (I) statement, and vice versa.

It is a commonplace that these inferences break down when (A) and (E) state-
ments are interpreted as universal conditionals and (I) and (O) statements are inter-
preted as existential conjunctions, unless at least the placeholder ‘S’ is restricted to
general terms having existential import, that is, general terms true of at least one existent
object. For corroboration, let ‘S’ be the general term ‘planets between the Earth
and the Moon’, a general term without existential import, and let ‘P’ be the general
term ‘in solar orbit’, a general term that does have existential import. In the lan-
guage of traditional logic, the validity of the inferences described earlier is preserved
by requiring that all statements of the four basic forms have existential import with
respect to their constituent general terms.
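
The breakdown just described can be verified on a small example. The Python sketch below reads (A) and (E) as universal conditionals and (I) and (O) as existential conjunctions, with general terms represented by their extensions as finite sets; the function names and the sample extensions are assumptions made for the illustration.

```python
def stmt_a(s, p):
    """(A) All S are P, read as a universal conditional."""
    return all(x in p for x in s)


def stmt_i(s, p):
    """(I) Some S are P, read as an existential conjunction."""
    return any(x in p for x in s)


def stmt_e(s, p):
    """(E) No S are P, read as a universal conditional."""
    return all(x not in p for x in s)


def stmt_o(s, p):
    """(O) Some S are not P, read as an existential conjunction."""
    return any(x not in p for x in s)


# 'planets between the Earth and the Moon' has an empty extension.
planets_between = set()
in_solar_orbit = {"Earth", "Mars", "Venus"}

a_to_i_fails = stmt_a(planets_between, in_solar_orbit) and not stmt_i(
    planets_between, in_solar_orbit)
e_to_o_fails = stmt_e(planets_between, in_solar_orbit) and not stmt_o(
    planets_between, in_solar_orbit)
```

With an empty subject term both (A) and (E) are vacuously true while (I) and (O) are false, so the two subalternation inferences fail; the equivalence of (A) with the negation of (O), and of (E) with the negation of (I), holds for every choice of extensions.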

This policy has adverse consequences. First, the scope of the Port Royal theory of
immediate inference is thereby restricted, which thus precludes its use in assessing
the validity of inferences containing general terms without existential import in
subject position. For instance, under the current interpretation of the four basic
statement forms, the theory cannot be applied to many inferences containing state-
ments of physical law. The statement


All bodies on which no external forces are acting move uniformly in a given
direction


is such a statement because the general term ‘bodies on which no external forces are
acting’ lacks existential import. Second, the Port Royal theory allows no distinction
between inferences whose validity requires the assumption that at least the general
terms in subject position in the various statements making up the inference have
existential import and those inferences whose validity requires no such assumption.
For example, inferences from (A) statements to the negation of (O) statements, and
vice versa, require no such assumption, but inferences from (A) statements to (I)
statements do.

In the modern logic dating from Frege, object language counterparts of general
terms (or predicates) with and without existential import became available: a general
term (or predicate) ‘S’ has existential import just in case there exists an object x such
that x is S, otherwise it does not have existential import. (A) to (I) and (E) to (O)
inferences are modified to hold only on the additional assumption that there is an
object x such that x is S, but the mutual inferability between (A) statements and the
negation of the corresponding (O) statements, and between (E) statements and the
negation of the corresponding (I) statements, does not require any such assumption.
Given these object language counterparts of general terms (or predicates) with
(and without) existential import, it is now customary to say that the modern theory,
in contrast to the Port Royal theory of immediate inference, purports to be ‘free of
existence assumptions with respect to its general terms.’³

Nevertheless, the modern logic also faces a similar problem, but in its treatment of
singular inference. Where ‘t’ is a singular term, the following inference (called
‘universal instantiation,’ UI, after its rule counterpart):


UI For all objects x, x is such that x is S.
∴ t is S.


is valid in the modern logic. But, as has often been noted, the validity of this
inference is threatened by singular terms without existential import. Let ‘t’ be the
singular term ‘Vulcan’, an expression purporting to name a certain planet causing
the perturbations in the orbit of Mercury, and let ‘S’ be ‘there is an object y such that
x is the same as y’. Then the premise

For all objects x, there is an object y such that x is the same as y

of UI is true but its conclusion,

There is an object y such that Vulcan is the same as y

is false. Indeed, in a dialogue with Frege in the late 1800s, the validity of UI was
challenged on similar grounds by Pünjer. Frege's response (1969) has become the
standard response when ‘Vulcan’ is taken to be a genuine singular term, namely,
that in logic expressions that occupy the place of ‘t’ are presumed to have existential
import.

This policy on singular inference suffers from essentially the same difficulties
previously noted with respect to the Port Royal theory of immediate inference. First,
it restricts the scope of the modern theory of singular inference. For instance, the
modern theory of singular inference cannot be applied to the inferential ruminations
of astronomers prior to the discovery of Leverrier that there is no object that is
Vulcan; it cannot adjudicate the worth of the inference


Vulcan is the planet causing the perturbations in the orbit of Mercury.
That planet will be at location L at 10.00 PM.
So, Vulcan will be at location L at 10.00 PM.

Second, it cannot discriminate between inferences, like that just depicted, whose
validity does not require that their constituent singular terms have existential import
from those, like UI above, whose validity does.

In the modification of the modern theory of singular inference called free logic,
object language counterparts of singular terms with and without existential import are
readily available; a singular term ‘t’ has existential import just in case there exists an
object x such that x is t (or, more briefly, just in case t exists). The inference UI is
valid only with the additional premise that t exists. That is, the restriction that ‘t’
must have existential import is lifted in free logic, and the inference UI is replaced
by

RUI For all objects x, x is such that x is S.
t exists.
∴ t is S.

This restricted form of universal instantiation is a characteristic feature of free logics.
(Indeed, it is easily shown that UI above is valid just in case


There exists an object x, such that x is t


is valid.)

It is appropriate, then, to say that just as the modern logic of general inference
purports to be free of existence assumptions with respect to its general terms, free
logics are free of existence assumptions with respect to their general and singular
terms. They may thus be construed as the culmination of an attitude toward the logic
of terms most fully expressed in the logic dating from Frege, and, indeed, implicit in
his response above to Pünjer.
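
The contrast between UI and RUI drawn in this section can be sketched over a finite model. The Python fragment below assumes an actualist picture in which a singular term without existential import is simply given no denotation, and it fixes S as ‘there is an object y such that x is the same as y’; the function names, the sample domain and the denotation table are all invented for the illustration.

```python
def has_existential_import(term, denotation, domain):
    """'t exists': the term denotes a member of the domain of existents."""
    return term in denotation and denotation[term] in domain


def ui_fails(term, denotation, domain):
    """True if UI has a true premise and a false conclusion for this term."""
    premise = all(any(x == y for y in domain) for x in domain)
    conclusion = has_existential_import(term, denotation, domain)
    return premise and not conclusion


def rui_fails(term, denotation, domain):
    """The same check for RUI, which adds the premise 't exists'."""
    premise = all(any(x == y for y in domain) for x in domain)
    exists_premise = has_existential_import(term, denotation, domain)
    conclusion = has_existential_import(term, denotation, domain)
    return premise and exists_premise and not conclusion


domain = {"Mercury", "Venus"}
denotation = {"Mercury": "Mercury"}   # 'Vulcan' is left without a denotation

vulcan_breaks_ui = ui_fails("Vulcan", denotation, domain)
vulcan_breaks_rui = rui_fails("Vulcan", denotation, domain)
```

For this choice of S the conclusion of UI says exactly that t exists, so the added premise of RUI blocks the counterexample; this is the finite counterpart of the remark that UI is valid just in case ‘there exists an object x such that x is t’ is valid.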


12.3. Proof Theory 


Proof theoretical developments of free logic typically are of two sorts depending on
whether the primitive predicate of singular existence, ‘E!’, is available. For con-
venience, axiomatic formulations of both sorts will be presented here. It should be
noted, however, that natural deduction and tableau (or tree) versions of every
species of free logic are widespread.⁴

The vocabulary and grammar of a (first-order) free logic is not essentially different
from that of a standard (first-order) predicate logic [see chapter 1] except for the
possible presence of a singular existence symbol, ‘E!’, whereby ‘E!s’ would be well-
formed when ‘s’ was any individual term. ‘a’, ‘b’, ‘c’, etc., represent singular terms
(individual constants); ‘x’, ‘y’, ‘z’, etc., represent individual variables; ‘s’, ‘t’, etc., may
refer to any individual term, constant or variable; and unless specified otherwise, ‘A’,
‘B’, ‘C’, ..., etc., represent formulas with or without free variables. Statements are
closed formulas. ‘A(s/t)’ refers to the result of substituting ‘s’ for ‘t’ in A.

The transformation rules of the formal system PFL, without ‘E!’ in its primitive
vocabulary are these. An axiom of PFL, is a tautology or any closed statement of the
following forms:

MA1 A ⊃ ∀xA
MA2 ∀x(A ⊃ B) ⊃ (∀xA ⊃ ∀xB)
MA3 ∀y(∀xA ⊃ A(y/x))
MA4 ∀x∀yA ⊃ ∀y∀xA
MA5 ∀xA(x/a) if A is an axiom⁵

The only rule of inference is

D From A, A ⊃ B, infer B.


As usual, a derivation of a statement A from a set of statements S is a sequence
(A₁, ..., Aₙ) such that


(i) A = Aₙ,
(ii) Aᵢ is a member of S, or Aᵢ is an axiom, or Aᵢ is the result of previous members
of the sequence in accordance with D.

If S is empty, A is a theorem in PFL, and the sequence (A₁, ..., Aₙ) is a proof of A
in PFL,.
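
The definition of a derivation just given can be turned into a mechanical check. The Python sketch below is schematic: formulas are strings or nested tuples, a conditional A ⊃ B is encoded as ("imp", A, B), and is_axiom is a hypothetical callback supplied by the caller, since recognizing tautologies and instances of MA1 to MA5 is outside the scope of the sketch.

```python
def imp(a, b):
    """Encode the conditional 'a ⊃ b' as a tuple."""
    return ("imp", a, b)


def is_derivation(sequence, target, premises, is_axiom):
    """Check clauses (i) and (ii) of the definition of a derivation."""
    # Clause (i): the last member of the sequence is the derived statement.
    if not sequence or sequence[-1] != target:
        return False
    for i, step in enumerate(sequence):
        earlier = sequence[:i]
        # Rule D: some earlier a and earlier ("imp", a, step) yield step.
        by_rule_d = any(imp(a, step) in earlier for a in earlier)
        # Clause (ii): a premise, an axiom, or obtained by rule D.
        if not (step in premises or is_axiom(step) or by_rule_d):
            return False
    return True


def no_axioms(formula):
    """Stand-in axiom test that accepts nothing."""
    return False


premises = {"p", imp("p", "q")}
good = is_derivation(["p", imp("p", "q"), "q"], "q", premises, no_axioms)
bad = is_derivation(["q"], "q", premises, no_axioms)
```

A proof in the sense of the text is the special case in which premises is empty, so that every step must be an axiom or follow by D.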

When the vocabulary of PFL, contains an identity predicate and formulas s = t,
then if the transformation rules of PFL, are supplemented by at least


MA6 a = b ⊃ (A ⊃ A(b//a))


where ‘A(b//a)’ is the result of replacing a at one or more places in A by b, if at all,
then the restricted principle of universal specification (the statement counterpart of
the rule of restricted universal instantiation)


RUS ∀xA ⊃ (E!a ⊃ A(a/x))

is derivable with the help of the definition

Def1 E!t =df ∃x(x = t)


Moreover MA4, the principle of universal quantifier permutation, is also eliminable
from the primitive frame of PFL,. This system (PFL,.) is philosophically interesting,
because it provides a non-modal motivation for the principle MA3, a principle
independently recommended in Kripke's treatment of quantified modal logic [see
chapter 7].

Another formulation, PFL.», in which ‘E!’ is a primitive symbol without identity,
is readily obtainable from the core of PFL,.⁶ To obtain the most typical version, sans
identity, it is sufficient to replace MA3 and MA4 in the primitive frame of PFL,,
respectively, by RUS (= MA3*) and


MA4* ∀xE!x


To obtain the standard version of PFL, with identity (PFL,.), it is sufficient to
supplement PFL, with MA6 and


MA7 a = a

See, for example, Meyer and Lambert (1968). One consequence of this formulation
is formal justification for the definition Def1, as Hintikka (1959) was the first to
show; i.e., the biconditional


HT E!a ≡ ∃x(x = a)


is derivable in PL... Hintikka's discovery assumes greater importance in developments
like PFL, (and indirectly in PFL;) given the discovery by Meyer et al. (1982) that
E!a is indefinable in PFL, itself. Much tradition to the contrary notwithstanding,
these two results show that the assertion (or denial) of, say, the planet Vulcan's
existence can only be effected, in standard logics sans identity but freed of existence
assumptions with respect to their terms, by use of the general term ‘exist’.

The formal systems above are positive free logics, intuitively reflecting, on appro-
priate interpretation, that some atomic sentences containing singular terms without
existential import are true. There are, however, formal systems of free logic, the
negative free logics, which are intended to reflect the intuition that all such sentences
are false. To represent this in the object language, virtually all versions of negative
free logic (NFL) have a primitive existence symbol if identity is not available in the
language.⁷ So, corresponding to PFL,., there is the system of negative free logic
NFL,,. It is obtained from PFL,. by substituting for the meta-axioms MA4* and
MA7, respectively, the meta-axioms





MA4** ∀x∃y(x = y)
MA7+ ∀x(x = x)

and adding the meta-axiom

MA8+ A(a_1, ..., a_n) ⊃ (∃x(x = a_1) & ... & ∃x(x = a_n)), where n ≥ 1 and
A(a_1, ..., a_n) is atomic.

This system, due in essentials to Burge (1991 [1974], p. 192), permits contexts of
the form ‘E!s’ to be introduced as in Def1 or, alternatively, by the definition

Def2 E!s =df s = s

Also, as in PFL_=, RUS can be reduced to the status of a derived principle. MA7 no
longer holds, failing, in virtue of MA8+, when a is a singular term not having
existential import. Of special note is the fact that

A(a/x) ⊃ ∃xA, if A(a/x) is atomic

is derivable, a principle which is not derivable (nor is logically true) in any positive
free logic. This is, perhaps, the most important difference between negative and
positive free logics.

When ‘E!’ is taken as primitive, corresponding to PFL_E! is the system of negative
free logic NFL_E!. It is obtained by appending to the set of meta-axioms in PFL_E! the
meta-axiom

MA8+* A(a_1, ..., a_n) ⊃ (E!a_1 & ... & E!a_n), where n ≥ 1 and A(a_1, ..., a_n)
is atomic.⁸

Similarly, corresponding to the system PFL_E!=, there is the system of negative free
logic NFL_E!=. It is obtained by adding to NFL_E! the meta-axiom MA6, and substituting
the meta-axiom

MA7* ∀x(x = x)

for MA7 of PFL_E!=. See Scales (1969, esp. pp. 11-12).

12.4. Model Theory
Typically, model structures for free logics are of two kinds. The simplest kind, an
FM_1 model structure, is an ordered pair (D, f), where the domain D is a (possibly
empty) set of objects, and the interpretation function f is a function such that

(i) where a is a singular term, f(a) is a member of D, if f(a) is defined
(ii) where P is an n-adic predicate, f(P) is a set of n-tuples of members of D
(iii) every member of D has a name.

Customarily D is construed as the set of existent objects, and hence this represents
the actualist or ‘Russellian’ ontological picture described above. In the characterization
of an FM_1 model structure, (i) reveals f to be a partial function;⁹ it
typically represents the possibility that a singular term may not have existential
import. (iii) is merely a convenience to enable the clause for the universal quantifier
in the ensuing definition of truth in an FM_1 model to be given a substitutional
characterization; it is adequate for most logical purposes.

Models of the FM_1 variety are exploited in all three kinds of free logic, positive,
negative and neutral. To illustrate this, consider first the definition of truth for a
negative free logic with identity such as NFL_=.

A statement is true (or false) in an FM_1 model just in case these conditions obtain
(Burge, 1991 [1974], p. 194):

(i) If A has the form P(a_1, ..., a_n), then A is true in an FM_1 model if each of
f(a_1), ..., f(a_n) is defined and (f(a_1), ..., f(a_n)) is a member of f(P);
otherwise A is false in that model.

(ii) If A has the form a = b, then A is true in an FM_1 model if f(a) and f(b) are
defined and f(a) is the same as f(b); otherwise A is false in that model.

(iii) If A has the form ~B, then A is true in an FM_1 model if B is false in that model;
otherwise A is false in that model.

(iv) If A has the form B ⊃ C, then A is true in an FM_1 model if B is false in that
model or C is true in that model; otherwise A is false in that model.

(v) If A has the form ∀xB, then A is true in an FM_1 model if B(a/x) is true in
that model for all a such that f(a) is a member of D; otherwise A is false in
that model.

To obtain a definition of truth in an FM_1 model for a negative free logic with a
primitive existence symbol but without identity, such as NFL_E! above, it suffices to
replace clause (ii) in the definition above by

(ii*) If A has the form E!a, then A is true in an FM_1 model if f(a) is defined in that
model; otherwise A is false in that model.

Soundness and completeness of NFL_= (with and without the primitive symbol E!)
based essentially on this kind of semantics have been readily established (Schock, 1968).
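
The negative truth clauses above can be sketched as a small evaluator. The following Python is only an illustration under stated assumptions, not Burge's or Schock's own formulation: the tuple encoding of formulas, the function name `truth`, and the toy model with the names `sol`, `mercury` and `vulcan` are all introduced here for the example. Partiality of f is modeled by a dictionary that has no entry for a term lacking existential import, and the quantifier is read substitutionally over the denoting names.

```python
# Toy FM_1 model (hypothetical): f is partial, so 'vulcan' is absent from F.
F = {"sol": "the_sun", "mercury": "planet_mercury"}     # names -> members of D
EXT = {"Planet": {("planet_mercury",)}}                 # predicate extensions

def truth(A, f=F, ext=EXT):
    """Truth (True) or falsity (False) of formula A under the negative clauses."""
    op = A[0]
    if op == "pred":                 # ("pred", "P", ("a", ...)), clause (i)
        _, p, args = A
        if any(a not in f for a in args):
            return False             # a non-denoting term makes the atom false
        return tuple(f[a] for a in args) in ext[p]
    if op == "eq":                   # ("eq", "a", "b"), clause (ii)
        _, a, b = A
        return a in f and b in f and f[a] == f[b]
    if op == "not":                  # clause (iii)
        return not truth(A[1], f, ext)
    if op == "imp":                  # clause (iv)
        return (not truth(A[1], f, ext)) or truth(A[2], f, ext)
    if op == "all":                  # ("all", body), clause (v); body: name -> formula
        return all(truth(A[1](a), f, ext) for a in f)
    raise ValueError(f"unknown operator: {op}")

self_identity = ("all", lambda a: ("eq", a, a))          # forall x (x = x)
vulcan_identity = ("eq", "vulcan", "vulcan")
print(truth(self_identity))                              # universal claim
print(truth(vulcan_identity))                            # instance at an empty term
print(truth(("imp", self_identity, vulcan_identity)))    # unrestricted specification
```

In this toy model the universal claim holds while its instance at the non-denoting name does not, which is the sense in which unrestricted universal specification fails.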
For a system of positive free logic like PFL_E!= above, another kind of definition of
truth based ultimately on an FM_1 kind of model structure utilizes van Fraassen’s
widely exploited notion of ‘supervaluation.’ Here it is presented as augmented by
Bencivenga (1991 [1980]); see the references there to van Fraassen’s original work.

First, the notion of a classical model structure is needed. A classical model structure
is exactly like an FM_1 model structure minus the provision ‘if f(a) is defined’ in
clause (i). In short, in classical models, every singular term is assigned some member
of the domain D; there are no terms without existential import.¹⁰ Nevertheless, in
this approach, under the constraint of the facts, such models can be used to help to
assess the truth-value of statements containing singular terms that do lack existential
import because they can be viewed as ways of completing an FM_1 model.

Second, a complete model structure FM_1* based on an (incomplete) model structure
FM_1 = (D, f) may now be defined as a pair (D*, f*) such that

(i) D* is nonempty and has D as a subset

and the interpretation function f* obeys the conditions that

(ii) f*(a) is a member of D*
(iii) f*(a) = f(a) wherever f(a) is defined
(iv) f(P) ⊆ f*(P) for every n-adic predicate P.

A complete model structure FM_1* based on FM_1 is a completion of FM_1.

Third, the truth-value of a statement (true, false, or neither) in an FM_1 model
(D, f) can now be defined progressively in stages, the third and last of which invokes
the notion of a supervaluation.
Where A is an atomic statement of the language:

(i) if A has the form P(a_1, ..., a_n), and if all of f(a_1), ..., f(a_n) are defined, then
A is true in the FM_1 model just in case (f(a_1), ..., f(a_n)) is a member of f(P),
and otherwise A is false therein

(ii) if A has the form in (i), and at least one of f(a_1), ..., f(a_n) is undefined, then
A has no truth-value

(iii) if A has the form E!a, then A is false in the FM_1 model iff (if and only if) f(a)
is undefined

(iv) if both of f(a) and f(b) are defined, and A has the form a = b, then A is true
in the FM_1 model if f(a) is the same as f(b), and otherwise A is false therein

(v) if exactly one of f(a) and f(b) is undefined, and A has the form a = b, then A
is false in an FM_1 model

(vi) if neither of f(a), f(b) is defined, and A has the form a = b, then A has no
truth-value in an FM_1 model.

Intuitively, the foregoing definition may be construed as providing the basic factual
information upon which to calculate the truth-value of any statement in the formal
language. Clauses (iii) and (v) are important because they show statements which
are factually false, in some uncontroversial sense of ‘factual’, even when containing a
singular term without existential import. The next definition takes seriously the
notion of a completion of an FM_1 model.

Where A is a statement of the language, and M* is a completion of an FM_1 model
M, A is true or false in M* vis-à-vis M under the following conditions:

(i) if A is an atomic statement and has a truth-value in M, then the truth-value of
A in M* vis-à-vis M is the same as the truth-value of A in M

(ii) if A is a truth-valueless atomic statement in M, then the truth-value of A in
M* vis-à-vis M is the same as the truth-value of A in M*

(iii) if A has the form ~B, then A is true in M* vis-à-vis M if B is false in M*
vis-à-vis M

(iv) if A has the form B ⊃ C, then A is true in M* vis-à-vis M iff B is false in M*
vis-à-vis M or C is true in M* vis-à-vis M

(v) if A has the form ∀xB, then A is true in M* vis-à-vis M iff B(a/x) is
true in M* vis-à-vis M for all singular terms a such that E!a is true in M*
vis-à-vis M.

Clause (i) in this definition shows that the facts cannot be overridden in a completion.
For example, if E!a should turn out false in an FM_1 model M in virtue of the
singular term ‘a’ having no existential import, it has the same value in M* vis-à-vis
M even though in M* (a kind of model in which all singular terms have existential
import) E!a is true. This is the formal force of the phrase ‘under the constraint of
the facts’ above.

Given an FM_1 model M, the supervaluation S over M is the set of all completions
of M. Truth, falsity and neither in a supervaluation can now be defined as follows:
Where A is a statement of the formal language, and S_M is the supervaluation S
over the FM_1 model M, then:

(i) A is true in S_M iff A is true in M* vis-à-vis M for every completion M* based
on M

(ii) A is false in S_M iff A is false in M* vis-à-vis M for every completion M* based
on M

(iii) A is neither true nor false in S_M iff A is true in M* vis-à-vis M for some M*
based on M and A is false vis-à-vis M in others.

In this kind of semantics, supervaluations are the admissible valuations, i.e., they are
the valuations in terms of which logical truth and validity are defined.¹¹ Soundness
and (weak) completeness of essentially PFL_E!= with a primitive existence symbol are
readily available.¹²
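
The three stages just described can be sketched in code. The following Python is a deliberately restricted illustration, not the van Fraassen-Bencivenga definition itself: it assumes a finite vocabulary, unary predicates only, and a small family of completions in which a single fresh object (written `#`) is added to the domain, whereas the definition above quantifies over all completions. The names `NAMES`, `F`, `P`, `completions`, `value` and `supervalue` are hypothetical.

```python
from itertools import product

NAMES = ["mercury", "vulcan"]          # hypothetical vocabulary
D = {"m"}                              # the existents
F = {"mercury": "m"}                   # partial f: 'vulcan' has no denotation
P = {"Planet": {"m"}}                  # unary predicate extensions

def atomic_in(A, f, preds):
    """Stage one: value of an atomic statement; None means no truth-value."""
    kind = A[0]
    if kind == "E!":                   # ("E!", "a")
        return A[1] in f
    if kind == "pred":                 # ("pred", "P", "a")
        _, p, a = A
        return (f[a] in preds[p]) if a in f else None
    _, a, b = A                        # ("eq", "a", "b")
    if a in f and b in f:
        return f[a] == f[b]
    return None if (a not in f and b not in f) else False

def completions():
    """Yield completions (f*, P*) built from one fresh object (a restriction)."""
    fresh = "#"
    d_star = sorted(D | {fresh})
    empty = [n for n in NAMES if n not in F]
    for choice in product(d_star, repeat=len(empty)):
        f_star = {**F, **dict(zip(empty, choice))}
        for flags in product([False, True], repeat=len(P)):
            p_star = {p: ext | ({fresh} if add else set())
                      for (p, ext), add in zip(P.items(), flags)}
            yield f_star, p_star

def value(A, f_star, p_star):
    """Stage two: truth of A in a completion M* vis-a-vis M."""
    kind = A[0]
    if kind in ("E!", "pred", "eq"):
        v = atomic_in(A, F, P)         # the facts in M take precedence
        return v if v is not None else atomic_in(A, f_star, p_star)
    if kind == "not":
        return not value(A[1], f_star, p_star)
    if kind == "imp":
        return (not value(A[1], f_star, p_star)) or value(A[2], f_star, p_star)
    # kind == "all": A[1] maps a name to a formula; range over names with E!a true
    return all(value(A[1](a), f_star, p_star) for a in NAMES if a in F)

def supervalue(A):
    """Stage three: True, False, or None (neither) across the completions."""
    vals = {value(A, fs, ps) for fs, ps in completions()}
    return vals.pop() if len(vals) == 1 else None

planet_vulcan = ("pred", "Planet", "vulcan")
print(supervalue(planet_vulcan))                            # a truth-value gap
print(supervalue(("imp", planet_vulcan, planet_vulcan)))    # a logical truth
print(supervalue(("eq", "vulcan", "vulcan")))               # self-identity
```

Under these assumptions the atomic predication about the non-denoting name receives no truth-value, while a tautology built from it and the self-identity both come out true in every completion considered.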

Finally, a definition of truth based on an FM_1 kind of model for neutral free logic
is also possible. Following Lehmann (1994), ‘∨’ and ‘∃’ replace ‘⊃’ and ‘∀’ as
primitive signs in what follows:

(i) If A has the form P(a_1, ..., a_n), then if f(a_1), ..., f(a_n) are defined, then A
is true (false) in an FM_1 model if (f(a_1), ..., f(a_n)) is a member of f(P) (is not
a member of f(P)); otherwise A has no truth-value therein.

(ii) If A has the form a = b, then A is true (false) in an FM_1 model if f(a) and f(b)
are defined and f(a) is the same as f(b) (are not the same); otherwise A has no
truth-value therein.

(iii) If A has the form ~B, then A is true (false) in an FM_1 model if B is false (true)
therein; otherwise A has no truth-value therein.

(iv) If A has the form (B ∨ C), then A is true in an FM_1 model just in case B is
true and C is true, or B is false and C is true, or B is true and C is false; A is
false therein if both B and C are false; otherwise A has no truth-value therein.

(v) If A has the form ∃xB, then A is true in an FM_1 model if B(a/x) is true
therein for some singular term a such that f(a) is defined; otherwise A is false
therein.

This definition of truth is a more conventional adaptation of Lehmann’s semantics.
It yields, indirectly, a soundness and completeness proof of the Jeffrey-like tree rules
in Lehmann’s formulation of the proof theory for his version of neutral free logic.¹³
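
The neutral clauses can be read as three-valued operations in which a truth-value gap propagates through the connectives. The sketch below is an assumption-laden illustration of clauses (i) and (iii)-(v), not Lehmann's own presentation: `None` stands for "no truth-value", and the function names and toy model are hypothetical.

```python
def atom(extension, f, args):
    """Clause (i): no truth-value unless every term denotes."""
    if any(a not in f for a in args):
        return None
    return tuple(f[a] for a in args) in extension

def neg(b):
    """Clause (iii): ~B is true (false) if B is false (true); otherwise no value."""
    return None if b is None else (not b)

def disj(b, c):
    """Clause (iv): a disjunction has a value only when both disjuncts do."""
    if b is None or c is None:
        return None
    return b or c

def exists(body, f):
    """Clause (v): true if B(a/x) is true for some denoting a; otherwise false."""
    return any(body(a) is True for a in f)

F = {"mercury": "m"}                   # 'vulcan' has no denotation
PLANET = {("m",)}

planet_vulcan = atom(PLANET, F, ("vulcan",))
print(planet_vulcan)                                   # gap
print(disj(planet_vulcan, neg(planet_vulcan)))         # excluded middle: gap
print(exists(lambda a: atom(PLANET, F, (a,)), F))      # some denoting name is a planet
```

The second line shows the contrast with the supervaluational approach: here an instance of excluded middle built on a non-denoting term inherits the gap rather than coming out true.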

The second kind of model structure, FM_2, that one finds in treatments of free
logics takes M to be a triple (D_o, D_i, f), where D_o (the outer domain) is a possibly
empty set, D_i (the inner domain) is a possibly empty set disjoint from D_o, whose
union D_o ∪ D_i is nonempty, and f is an interpretation function such that

(i) f(a) is a member of D_o ∪ D_i
(ii) f(P), where P is an n-adic predicate, is a set of n-tuples of members of D_o ∪ D_i
(iii) every member of D_o ∪ D_i has a name.

Typically the inner domain is interpreted as the set of existent objects, and the outer
domain is construed as the set of nonexistent objects, and thus an FM_2 model
represents the inhabitants of the nonactualist, or Meinongian, ontology.¹⁴ Inner
domain-outer domain model structures, as these kinds of model structures have
come to be known,¹⁵ are used most widely in the semantical developments of positive
free logics, and much less frequently in negative free logics; there are no known
cases of such structures in semantical treatments of neutral free logics. It is to be
noted that, in the typical construal, the interpretation function f defined on the
singular terms is total, and, hence, that a singular term can refer even when it is
devoid of existential import.

A statement A (of the formal language) is true (or false) in an FM_2 model under
the following conditions:

(i) If A has the form P(a_1, ..., a_n), then A is true in an FM_2 model if (f(a_1), ...,
f(a_n)) is a member of f(P); otherwise A is false therein.

(ii) If A has the form ~B, then A is true in an FM_2 model just in case B is false;
otherwise A is false therein.

(iii) If A has the form B ⊃ C, then A is true in an FM_2 model if B is false therein
or C is true therein; otherwise A is false therein.

(iv) If A has the form ∀xB, then A is true in an FM_2 model if B(a/x) is true
therein for all singular terms a such that f(a) is a member of D_i; otherwise A
is false therein.

Essentially, this semantics yields soundness and completeness (weak and strong) for
PFL.¹⁶ When identity is added to the language, as in PFL_=, it is necessary to add
the condition

(v) If A has the form a = b, then A is true in an FM_2 model if f(a) and f(b) are the
same; otherwise A is false therein,

to extend the completeness proof (Leblanc, 1982). For a formulation such as PFL_E!,
to obtain soundness and completeness, it is necessary to add

(v*) If A has the form E!a, then A is true in an FM_2 model if f(a) is a member of
D_i; otherwise A is false therein,

in place of (v). Given the addition of (v) and (v*) to the truth definition above, the
soundness and completeness (weak and strong) of PFL_E!= are straightforward enough
(Meyer and Lambert, 1968).

FM_2 model structures have also been used in negative free logic, especially NFL_E!=.¹⁷
The only difference from the definition of truth given above for positive free logic
lies in clauses (i) and (v). These are replaced, respectively, by

(i*) if A has the form P(a_1, ..., a_n), then A is true in an FM_2 model if each of
f(a_1), ..., f(a_n) is a member of D_i, and (f(a_1), ..., f(a_n)) is a member of
f(P); otherwise A is false therein

(v") if A has the form a = b, then A is true in an FM_2 model if f(a) and f(b) are
members of D_i and f(a) is the same as f(b); otherwise A is false therein.

The import of (i*) and (v") is to make false any atomic statement containing a
singular term without existential import, for example, ‘Vulcan is a planet’ and ‘Vulcan
= Vulcan’, in contrast to the definition of truth in an FM_2 model for a positive free
logic like PFL_E!=. Soundness and completeness are readily established for NFL_E!= in
this semantic development.¹⁸
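
The contrast between the positive and the negative reading of an inner domain-outer domain structure can be made concrete. The Python below is a minimal sketch under assumptions of its own: the toy model, the tuple encoding of formulas, and the `negative` switch (which selects clauses (i*) and (v") in place of (i) and (v)) are hypothetical devices for illustration, not part of any cited system.

```python
D_OUTER = {"vulcan_obj"}               # nonexistent objects
D_INNER = {"mercury_obj"}              # existent objects
F_TERM = {"Vulcan": "vulcan_obj", "Mercury": "mercury_obj"}   # total on names
F_PRED = {"Planet": {("mercury_obj",), ("vulcan_obj",)}}

def truth(A, negative=False):
    """Truth of A in the toy FM_2 model; negative=True uses (i*) and (v")."""
    kind = A[0]
    if kind == "pred":                 # ("pred", "P", ("a", ...))
        _, p, args = A
        vals = tuple(F_TERM[a] for a in args)
        if negative and any(v not in D_INNER for v in vals):
            return False               # clause (i*)
        return vals in F_PRED[p]       # clause (i)
    if kind == "eq":                   # ("eq", "a", "b")
        x, y = F_TERM[A[1]], F_TERM[A[2]]
        if negative and not (x in D_INNER and y in D_INNER):
            return False               # clause (v")
        return x == y                  # clause (v)
    if kind == "E!":                   # clause (v*)
        return F_TERM[A[1]] in D_INNER
    if kind == "not":
        return not truth(A[1], negative)
    if kind == "imp":
        return (not truth(A[1], negative)) or truth(A[2], negative)
    # kind == "all": quantify over names that denote inner-domain objects
    return all(truth(A[1](a), negative) for a in F_TERM if F_TERM[a] in D_INNER)

vulcan_identity = ("eq", "Vulcan", "Vulcan")
vulcan_planet = ("pred", "Planet", ("Vulcan",))
print(truth(vulcan_identity), truth(vulcan_identity, negative=True))
print(truth(vulcan_planet), truth(vulcan_planet, negative=True))
```

The same structure thus verifies ‘Vulcan = Vulcan’ on the positive clauses and falsifies it on the negative ones, while ‘E!Vulcan’ is false and ∀xE!x is true on both.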


12.5. Some Applications and Implications

Applications of free logics are wide and varied, ranging from the philosophy of
religion at one extreme to programming languages at the other. A few of the more
important are discussed below.

12.5.1. Definite descriptions

The earliest and most well known application is to the (logical) theory of definite
descriptions. Free logics have provided, for the first time in nearly a half century,
new foundations for such theories, foundations differing from those provided by
Russell, Frege, and Hilbert and Bernays. Virtually all free theories of definite
descriptions add the following minimal principle to the underlying free logic with identity.¹⁹
Where ‘ι’ is the definite description operator,

MFD ∀x(x = ιyA ≡ ∀y(A ≡ y = x))

MFD does not hold in standard predicate logic, yet, in negative free logic with
identity, it yields the analogs of the famous pair of definitions in Russell’s theory of
definite descriptions for atomic contexts as theorems without having to reject the
singular term status of expressions of the form ‘ιyA’. Moreover, if a complex predicate
forming operator is added to the language, so that scope distinctions can be
made, then the analogs of the Russell definitions extended to all contexts can
be derived as theorems (Burge, 1991 [1974]; Scales, 1969). Similarly, in positive free
logics, the famous elimination theorem of Frege can be derived with the help of the
additional extensionality principle,

STExt ∀x(x = s ≡ x = t) ⊃ s = t

where ‘s’ and ‘t’ are singular terms (names or definite descriptions). The system
containing this pair of principles, MFD and STExt, is known as FD2 in the literature
(Lambert, 1963b, 1964; Scott, 1991 [1967]; van Fraassen and Lambert, 1967).
Indeed, in positive free logics with identity, a whole hierarchy of definite description
theories has emerged between the theory containing only MFD as the minimal
theory and FD2 as the maximal theory. The exact nature of this hierarchy is not yet
well understood.²⁰ Finally, it has recently been shown that the four major traditions
in the logical treatment of definite descriptions can be seen as reactions to the
unsoundness of the very natural principle MFD in standard predicate logic in the
same way that various treatments of sets can be seen as reactions to the unsoundness
of the principle of set abstraction in naive set theory (as Russell was the first to
show) (Lambert, 1991b).





12.5.2. Presupposition

The importance of the semantical notion of presupposition in modern philosophical
endeavors is hard to exaggerate. Two examples will suffice. First, Brittan’s
reconstruction of Kant’s theory of science relies heavily on the oft heard language that the
(basic propositions of the) Categories ‘presuppose’ the propositions of (the current)
mathematics (e.g., Euclidean geometry) and (the current) natural science (Brittan,
1978, esp. pp. 28-42). Brittan treats that notion in the Frege-Strawson sense and
argues

1. it is consistent with much Kantian text, and

2. allows Kant to escape, for example, the criticism that his metaphysical views are
falsified by the subsequent development of non-Euclidean geometries.

Second, Lambert’s reconstruction of Reichenbach’s treatment of the logic for quantum
mechanics dispenses with Reichenbach’s third truth-value and, in effect, treats
elementary statements about definite position (or definite momentum) as presupposing
certain measurement conditions. Since the presuppositions must fail, according
to the laws of modern quantum mechanics, elementary statements about definite
position (or momentum) are ‘meaningless’, i.e., have no truth-value, à la the
Copenhagen interpretation (Lambert, 1968a). In both of these examples, the notion of
presupposition is treated as a semantical notion, and has received rigorous treatment
in some positive free logics and in neutral free logics.

In the supervaluational treatment of positive free logic (van Fraassen, 1991 [1968]),
the Frege-Strawson notion of presupposition is analyzed as a metalogical relation:

PRE_1 A presupposes B =df Neither A nor ~A is true if B is false

This relation is distinct from the relation of logical implication and leads to no
inconsistency in the van Fraassen-Bencivenga treatment of supervaluations because,
though the law of excluded middle is a logical truth, the principle of bivalence, that
every statement is true or false, does not hold. In Woodruff’s version of neutral free
logic, which contains a truth operator in the object language, the Frege-Strawson
notion is treated as an object language relation as follows:

PRE_2 A presupposes B =df T(A) ∨ F(A) → B

where ‘A → B’ means ‘T(A) ⊃ T(B)’ (Woodruff, 1970, pp. 134-7). The choice between
the two approaches depends on one’s preference for the non-truth-functional
supervaluational approach versus the truth-functional Frege-inspired approach favored
in most semantical versions of neutral free logic. In either foundation for free logic,
it follows that many, if not most, predications containing a singular term without
existential import, for example, ‘Vulcan revolves around the sun’, will have no
truth-value because they presuppose a false statement like ‘Vulcan exists.’


12.5.3. Partial functions and programming

The most recent and flourishing nonphilosophical application of free logics has been
in the treatment of partial functions vis-à-vis the development of programming
languages and program verification. Lambert and van Fraassen (1972, pp. 209-210)
noted that one of the obvious applications for free logic lay in the development
of a natural theory of partial functions, functions which yield no values for some
arguments. Later, Beeson (1985) utilized, in effect, a negative free logic for just this
purpose; see also Troelstra and van Dalen (1988, esp. ch. 2, section 2). More
recently, Feferman (1995) has used negative free logic for reasoning about expressions
which may or may not have a value, especially in complex computational languages,
and Farmer (1995) has used a negative free logic in the development of the
programming language IMPS for use in reasoning about partial functions. Parnas (1995),
likewise, has employed a negative free logic in the development of a technical
language for use in the description of software, while Gumb (1989, ch. 5) uses a
positive free logic to express information about execution-time errors in programs.

The issue in most of these cases concerns the truth-value to be assigned to
identities containing partial function names, for example,

5/0 = 5/0

where ‘/’ is the two-place partial function sign for division. This sentence is false in
all negative free logics, and, indeed, that is the policy most often followed in free
logical treatments of partial functions. Recently, however, Gumb and Lambert (1997)
have argued that for certain programming languages this policy would be disastrous,
for example, as in ALGOL60 and LISP. It is essential in such programming
languages that a sentence like

5 = if 5 = 5 then 5 else 5/0

turn out true, and, hence, they have argued that, at least for certain programming
languages, the underlying free logic must be positive.
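
The two sentences above can be tried out directly in a programming language. The following Python sketch is only an illustration of the phenomenon, not a rendering of the argument of Gumb and Lambert (1997): the helper names `strict_ite`, `lazy_ite` and `negative_identity` are hypothetical, and treating a `ZeroDivisionError` as "the term has no value" is an assumption made for the example.

```python
def strict_ite(cond, then_val, else_val):
    """A strict conditional: both branches are evaluated before the call."""
    return then_val if cond else else_val

def lazy_ite(cond, then_thunk, else_thunk):
    """A non-strict conditional: only the selected branch is evaluated."""
    return then_thunk() if cond else else_thunk()

def negative_identity(lhs_thunk, rhs_thunk):
    """Negative policy: an identity is false if either side has no value."""
    try:
        return lhs_thunk() == rhs_thunk()
    except ZeroDivisionError:
        return False

# 5/0 = 5/0 evaluated under the negative policy.
print(negative_identity(lambda: 5 / 0, lambda: 5 / 0))

# 5 = if 5 = 5 then 5 else 5/0, with a non-strict conditional.
print(5 == lazy_ite(5 == 5, lambda: 5, lambda: 5 / 0))

# The same sentence with a strict conditional never reaches a value.
try:
    strict_ite(5 == 5, 5, 5 / 0)
except ZeroDivisionError:
    print("strict conditional: the term has no value")
```

The non-strict conditional returns a value even though one of its subterms has none, which is the behavior a logic for such languages has to accommodate.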


12.5.4. Extensionality

Turning, finally, to one of the most important philosophical applications of
free logic, it is necessary, first, to say what it means for a language to be completely
extensional in the sense of salva veritate substitution. A language is completely
extensional if the truth-value of every statement composed of singular terms and/or
predicates and/or statements is preserved when co-referential singular terms are
substituted for each other, co-extensive predicates are substituted for each other,
and co-valent statements are substituted for each other, in those statements. Every
free logic fails this test, and hence is not completely extensional. In particular, what
fails is the principle that co-extensive predicates always substitute for each other
salva veritate. This result follows from the fact that the principle of universal
specification, rejected in free logics, is logically equivalent to the principle

CP ∀x(A ≡ B) ⊃ (A(t/x) ≡ B(t/x))

where ‘t’ is a singular term (name or definite description) (Lambert, 1974). Indeed,
if ‘t’ is ‘Vulcan’, ‘A’ is ‘x = x’ and ‘B’ is ‘E!x & x = x’ (or ‘E!x ⊃ x = x’) one obtains
from CP

Vulcan = Vulcan ≡ (E!Vulcan & Vulcan = Vulcan)
(or E!Vulcan ⊃ Vulcan = Vulcan)

But no matter what truth-value the left-hand side of this biconditional has (including
none at all), the right-hand side will have a different truth-value depending on the
choice for ‘B’.²¹
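
The two-valued part of this counterexample can be checked mechanically. The sketch below is a simplified illustration under stated assumptions: it considers only the cases where ‘Vulcan = Vulcan’ is true (the positive policy) or false (the negative policy), it takes ‘E!Vulcan’ to be false, and it evaluates the antecedent of CP at a typical existent, for which both ‘x = x’ and ‘E!x’ hold. The function names are hypothetical.

```python
def iff(p, q):
    return p == q

def imp(p, q):
    return (not p) or q

def cp_instance(vulcan_self_identity, b_form):
    """Evaluate CP with A := x = x, t := Vulcan, and the chosen B.

    b_form "and" selects B := E!x & x = x; "imp" selects B := E!x -> x = x.
    """
    exists_vulcan = False                          # 'Vulcan' lacks existential import
    b_of = {
        "and": lambda e, i: e and i,
        "imp": lambda e, i: imp(e, i),
    }[b_form]
    # Antecedent: forall x (A <-> B), checked at an existent (E!x and x = x both true).
    antecedent = iff(True, b_of(True, True))
    # Consequent: A(Vulcan/x) <-> B(Vulcan/x).
    consequent = iff(vulcan_self_identity, b_of(exists_vulcan, vulcan_self_identity))
    return imp(antecedent, consequent)

print(cp_instance(True, "and"))    # positive policy, B := E!x & x = x
print(cp_instance(False, "imp"))   # negative policy, B := E!x -> x = x
```

Whichever truth-value the self-identity receives, one of the two choices for ‘B’ yields a false instance of CP, in line with the remark above.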
This failure can have dramatic philosophical consequences. For example,

(i) it is instrumental to the proof that the theory of predication in Quine’s (1960)
Word and Object, a cornerstone of his theory of referential opacity, is unsound,²²
and

(ii) it is essential to the proof that extensionality qua truth-value preservation and
extensionality qua truth-value dependence are not equivalent even in a predicate
logic with singular terms having no existential import and only identity
(Lambert, 1997b).

There are those who imagine that failures of various extensionality principles, in the
truth-preservation sense, have to do with the addition of operators like ‘it is possible
that’ and ‘believes that’ to the standard theory of general inference. But with the
advent of free logics, there is another kind of addition that takes on special
significance vis-à-vis extensionality. Just as the addition of modal operators to the standard
theory of inference can threaten the substitutivity of co-referential singular terms, so
can the addition of singular terms without existential import to the standard theory
of inference threaten the substitutivity of co-extensive predicates.


Suggested further reading. 


An especially readable motivation for free logic is the author’s recent Free Logics: Their
Foundations, Character, and Some Applications Thereof (Lambert, 1997a). Another is
Bencivenga’s (1989) essay, “Why Free Logic?” in his book Looser Ends. Other useful
introductions to the subject are Schock (1968), Lambert and van Fraassen (1972) and
Bencivenga et al. (1991). More recent technical work can be found in a variety of sources,
published and forthcoming, including the proof theory in Lehmann’s (1994) paper and a
forthcoming essay by Antonelli entitled “Proto-semantics for Positive Free Logic” and various
essays in the forthcoming volume New Directions in Free Logic, a set of essays edited by
E. Morscher. Further technical work can be found in the three papers by Feferman (1995),
Farmer (1995), and Parnas (1995), all in volume 43 of Erkenntnis; these concern the use of
essentially negative free logic to provide a foundation for a natural theory of partial
functions. More philosophically oriented work can be found in two recent essays by the author
(Lambert 1997b, 1998). Finally, the volume Philosophical Applications of Free Logic
(Lambert, 1991a) contains an especially wide array of applications of free logics, old and new,
from the philosophy of religion to the philosophy of mathematics.





Notes 


1 Thus the property of existential import is a property of terms, not of quantifiers. For a
more detailed discussion, see Eaton (1931, pp. 223-6).

2 The pivotal paper in the development of free logic is that of Leonard (1956). Other
pioneering studies include Hintikka (1959); Leblanc and Hailperin (1959); Smiley (1960);
Lambert (1963a, b; 1964); Schock (1964); van Fraassen (1966); Cocchiarella (1966);
and Meyer and Lambert (1968).

3 Not quite, however, because in the modern logic general terms of the form ‘is the same
as t’ must have existential import, where ‘t’ is a singular term.

4 For ‘natural deduction’ versions see Hintikka (1959), Schock (1968), and Lambert and
van Fraassen (1972). For tableau (or tree) versions see Bencivenga et al. (1991), and
Lehmann (1994).

5 This set of meta-axioms, essentially an augmentation of the set in the purely quantificational
fragment of Lambert (1963a), is due to Leblanc and Meyer (1970); Fine (1983) proved
that MA4 is independent of the other meta-axioms in this set.

6 The first such formulation is due to Lambert (1967).

7 Nevertheless, a version of NFL paralleling PFL might be obtained by adding to PFL
the meta-axiom: A(a/x) ⊃ ∃xA, provided A(a/x) is atomic. Though incomplete, as is
the purely quantificational fragment of Lambert’s (1963a) system (see note 5 above),
when appropriate meta-axioms for identity are added, as in the formulation of Burge
below, completeness is easily obtained.

8 This is essentially the system of Scales (1969, p. 11) minus his apparatus for complex
predicates. The language of the provision in MA8+*, however, is closer to Burge (1991
[1974]).

9 There are model structures similar to FM_1 model structures but in which f is a total
function, that is, (i) may be replaced by something like (i'): f(a) is a member of D or
f(a) = D. In this kind of development, singular terms without existential import, for
example, ‘Vulcan,’ are assigned D, and hence, under the conventional rendition of identity,
‘Vulcan = Vulcan’ turns out true. Hence, this yields a positive free logic (Scott, 1991
[1967]).

10 The reason the Scottish model described in note 9 is not a classical model is this:
though every singular term is assigned something, not every singular term is assigned
something in the domain D. In fact, those singular terms without existential import are
just those singular terms assigned to D itself.

11 The current statement of supervaluations follows very closely the account in Bencivenga
et al. (1991). There is a more lengthy statement of motivations underlying this kind of
semantic development in Lambert (1997a). Skyrms (1968) offered a way of augmenting
van Fraassen’s idea of supervaluations to make them sensitive to differences of structure
in atomic statements. It too is a positive free logic. In an informal note to Skyrms, David
Kaplan showed that the evaluation rules were not recursively axiomatizable.

12 See, for example, Bencivenga (1991 [1980]). Woodruff (1984) showed that neither
compactness nor strong completeness can be proved for PFL_= (with or without a
primitive existence symbol) based on supervaluations.

13 Lehmann (1994, sections 2-4). As Lehmann notes, his particular semantics was
anticipated by Smiley (1960), though Smiley presented no proof theory. Lehmann’s paper
contains the only published soundness and completeness proof for neutral free logic.
Woodruff offered a proof theory, containing a truth operator, and semantics for a neutral
free logic, but it turned out to be unsound (as Woodruff himself noted); see Woodruff
(1970, esp. p. 142). There is also an excellent discussion of the various semantical
approaches for (positive and neutral) free logics permitting truth value gaps in Lehmann
(1994). I am grateful to Lehmann for his advice in the adaptation of his original
semantics.

14 But not always. In the treatment by Meyer and Lambert (1968), the outer domain (the
semantical domain) is stipulated to be a set of expressions. The ontological picture is that
of a set of nominalized second intensions à la Goodman. So FM_2 there does not reflect
the essential ingredient of a Meinongian ontology because the objects (expressions) in
the outer domain exist.

15 This kind of model structure (and essentially the ensuing truth definition) was
independently invented by Lambert and by Belnap, and presented in lectures by them during the
late 1950s. Lambert’s inner domain-outer domain model structure (but not the ensuing
truth definition) is reflected in Meyer and Lambert (1968). The most extensive and
detailed published treatment of the use of inner domain-outer domain semantics in
(positive) free logic is by Leblanc and Thomason (1968).

16 See Leblanc (1982, pp. 58-75). Independently, Gumb (1979) (in a positive free logic 
without function names) and Dwyer (1988) (in a positive free logic with function names) 
have demonstrated that Craig's interpolation lemma and Beth's definability theorem [see 
chapter 1] extend to positive free logic. (The same pair of results extend to negative 
free logic if definitions are understood as in Schock (1968). I owe this observation to 
Ray Gumb.) 

17 See the formulation, minus the addition of the complex predicate forming operator ‘A,’ 
in Scales (1969, pp. 11-17). 

18 Scales' proof (1969, p. 124) is indirect, proceeding via an equivalent formulation in 
terms of tableau rules. To accommodate NFL, it suffices to drop the tableau rules 
involving the complex predicate operator ‘X’ in Scales' proof. 

19 The only exceptions may be a system proposed by Robinson (1979), and the system in 
Stenlund (1973). Most, if not all, free definite description theories take as their 
underlying foundation either positive or negative free logic (with identity). The first 
sound and complete free definite description theory was proposed by Lambert (1963b, 
1964). It was in the first of this pair of papers that the basic principle of free definite 
description theories, MED, was proposed; it is now called 'Lambert's Law' among free 
logicians. Of the free definite description theories that have been proposed and/or 
further studied, the more prominent are those in Bencivenga (1980), Burge (1991 
[1974]), Grandy (1991 [1972]), Scales (1969), Schock (1968), Scott (1991 [1967]), and 
van Fraassen and Lambert (1967). 

20 See Lambert (1997a, chs 6 and 7) for a fuller account of free theories of definite 
descriptions and the problem of the hierarchy of positive free definite description 
theories. 

21 In the case of Lehmann's semantics, the counter-example to CP requires different 
instances of both 'A' and 'B', namely, 'E!s' and 'E!x ⊃ E!x' respectively. 

22 The proof in question is in Lambert (1998). In this same paper, it is argued that if a 
predicate forming operator is added to the language of negative free logic, as in Scales 
(1969), and predications taken as just a subset of the atomic sentences, it may be 
possible to save a CP-like principle at least in negative free logics. For a realization of 
this possibility in positive free logic, see Lambert and Bencivenga (1986). 
Chapter 13 


Relevant Logics 
Edwin D. Mares and Robert K. Meyer 


Once upon a time, modal logic was castigated because it 'had no semantics.' Kripke, 
Hintikka, Kanger, and others changed all that. In a similar way, when Relevant 
Logic was introduced by Anderson and Belnap, it too was castigated for 'having no 
semantics.' Then Routley and Meyer (1982a [1973]) changed all that, along with 
Urquhart (1992a [1972]), Fine (1992b [1974]), and others. The present overview 
marks a culmination of that effort. The semantic approach described here brings 
together a number of hitherto disparate efforts to set out formal systems for logics 
of relevant implication and entailment. It also makes clear (despite some of our hopes 
and utterances) that the One True Logic does not exist. This is as true for relevant 
logics as Kripke et al. showed it to be for modal logics. In both cases, subtle (and 
not so subtle) variations on semantical postulates produce different logics in the 
same family. The question of which semantical postulates are correct makes no sense 
without further context, i.e., the questioner needs to answer the question: Correct 
for what? The question that does remain is: What motivates the relevant family of 
logics? And this is the question that is the main job for this chapter to investigate. 


13.1. A Little History 


Entailment, one would think, is a relation. It is the relation that holds between the 
premises of a valid argument and its conclusion. Yet modern symbolic logic, which at 
least since DeMorgan and Peirce has prided itself on taking relations seriously, failed 
to do so with respect to the central notion of logical consequence that it is its business 
to analyze. Here and later we shall insist on the essentially relational character of a 
good implication. 

The modern history of relevant logics begins at the same point as the history of 
modal logics, namely, with the disquiet over the thought that the material ⊃ is a 
decent implication. With the ink scarcely dry on the first edition of Whitehead and 
Russell's Principia Mathematica (1910-13), C. I. Lewis (1918) was already in print, 
decrying the paradoxes of 'implication.' The chief ones say (in English), 

P-   A false proposition implies anything. 
P+   Anything implies a true proposition. 

P- and P+ reflect the well-known truth table for ⊃, which looks like 

⊃ | F  T 
--+------ 
F | T  T 
T | F  T 

Now what, honestly, would induce a sane human being to suppose that this table 
captures 'implies'? (Or even, as has sometimes been urged instead, 'if ... then'?) 
Until they have been brainwashed with sophistries, elementary logic students grasp 
the point at once. This table is silly. No, Bertie, 'France is in Australia' does not 
imply 'The sea is sweet'. And no, Van, it is equally false to say 'If France is in 
Australia then the sea is sweet'. 
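
The table can be checked mechanically. The following is only a minimal sketch in 
Python; the names hook, table, p_minus and p_plus are ours, introduced for 
illustration, and are not part of the chapter's notation. 

```python
from itertools import product

def hook(a: bool, b: bool) -> bool:
    """Material implication: false only when a is true and b is false."""
    return (not a) or b

# Keys are (antecedent value, consequent value) pairs.
table = {(a, b): hook(a, b) for a, b in product([False, True], repeat=2)}

# P-: a false antecedent makes the conditional true whatever the consequent.
p_minus = all(table[(False, b)] for b in (False, True))

# P+: a true consequent makes the conditional true whatever the antecedent.
p_plus = all(table[(a, True)] for a in (False, True))
```

Both p_minus and p_plus come out true, which is just P- and P+ read off the 
table: nothing in the computation looks at what the two sentences are about. 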

To be sure, Logic is the science of argument, and like any other science, Logic has 
a right to simplifying assumptions and a formalism of its own. But it also has the 
obligation to enrich that formalism, the better to separate good arguments from 
bad. 

Lewis saw it that way, introducing several systems of strict implication to 
overcome the deficiencies of P- and P+, and, as shall be seen a little later, Lewis' 
original rejection of material implication is based on ideas very close to those of 
relevant logicians. Beginning with negation ~ and a binary consistency operator ∘, 
Lewis defined strict implication (our →), via the rubric 


D→   A → B =df ~(A ∘ ~B) 


That is, A (strictly) implies B just in case A is inconsistent with the negation of B. 
The task of formalizing a good theory of implication then becomes one of finding 
the right postulates for the binary possibility operator. 

In a certain sense, of course, Lewis believed that P- and P+ are true. He agreed 
that 


CP-   ~A ⊃ (A ⊃ B) 
CP+   A ⊃ (B ⊃ A) 


are logical truths. But he also held that "[t]he relation A ⊃ B in this calculus has not 
quite the usual meaning of 'A implies B,' due to the fact that relations of the system 
are those of extension" (Lewis and Langford, 1959, p. 85). Lewis had no objection 
to CP- and CP+ as material logical truths. He had no qualm with accepting 
the material hook (or horseshoe) as a legitimate connective. But he objected to 
the identification of the hook with implication. Pleasantly (as Lewis and his later 
co-author Langford saw it), CP- and CP+ are not theorem schemes of any of 
the systems of theirs (1959), when formulated with strict implication replacing 
material ⊃. 

Less welcome to many later logicians were the paradoxes of strict implication, 
which Lewis and Langford (1959) considered ineluctable. These say, again in English, 


SP-   An impossible proposition implies anything. 
SP+   Anything implies a necessary truth. 

Paradigmatic formal counterparts for Lewis of SP- and SP+ were the following: 

XP-   (A ∧ ~A) → B 
XP+   A → (B ∨ ~B) 


So important were XP- and XP+ to Lewis that he and Langford gave 'independent 
arguments' for them. Here is a version for XP- based on that of Lewis and Langford 
(1959, p. 250): 


1   A ∧ ~A            Hypothesis 
2   A                 1, ∧ Elim 
3   ~A                1, ∧ Elim 
4   A ∨ B             2, ∨ Intro 
5   B                 3, 4, Disjunctive Syllogism 
6   (A ∧ ~A) → B      1-5, → Intro 


The argument is simple, perhaps even familiar, but is it any good? Is each line really 
entailed by its premises? It seems that in his 1917 article "The Issues Concerning 
Material Implication" Lewis had already seen why it is fallacious. There he sets out 
a dialogue between two characters: X and himself (L). Here is a relevant part of that 
dialogue (Lewis, 1917, p. 385): 


L.  But tell me: do you admit that "Socrates was a solar myth" materially implies 
    2+2=5? 

X.  Yes; but only because Socrates was not a solar myth. 

L.  Quite so. But if Socrates were a solar myth, would it be true that 2+2=5? If 
    you granted some paradoxer his assumption that Socrates was a solar myth, 
    would you feel constrained to go on and grant that 2+2=5? 

X.  I suppose you mean to harp on "irrelevant" some more. 


In his and Langford's 'independent argument' for XP-, they do not "grant the 
paradoxer his assumption" that a contradiction holds. What is needed is a way to 
deal non-trivially with impossible assumptions, like contradictions or Socrates' being 
a solar myth. Section 13.4 returns to the treatment of impossibilities in relevant 
logic. The next section, 13.2, turns to another aspect of the relevant analysis of the 
paradoxes. 

13.2. Variable Sharing 


What in general is wrong with the paradoxes of implication? It would seem that, in 
each, there is an insufficient tie between antecedent and consequent or premise and 
conclusion. As Lewis says in the dialogue given above, there is a lack of relevance 
here. 

Relevant logics ensure that logically true implications do not have antecedents 
that are completely irrelevant to their consequents. As Ackermann (1956), the father 
of the theory of relevant entailment, wrote, there should be a connection between 
the content of the antecedent and the content of the consequent. This connection 
might seem difficult to enforce, for content is a semantic notion. The notion of a 
logic, on the other hand, is usually taken to be a syntactic concept, specified either 
in terms of a set of valid proofs or of a set of theorems. The gap between the 
semantic and the syntactic is bridged in part, however, by the variable sharing 
constraint. A logic, L, satisfies the variable sharing constraint iff (if and only if) 
whenever A → B is a theorem of L, A and B share at least one propositional variable. 
The variable sharing constraint forces the antecedent and consequent to share some 
content, for then they are, in part, both about at least one or two or more propositions. 
Thus, they cannot be absolutely semantically irrelevant to one another. 
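
The syntactic test itself is easy to state in code. The sketch below is 
illustrative only: the tuple encoding of formulas and the function names 
variables and shares_variable are our assumptions, not the chapter's. It 
checks a single candidate implication, whereas the constraint proper 
quantifies over all theorems of a logic. 

```python
def variables(formula):
    """Collect the propositional variables of a formula.

    A formula is a string (a variable) or a tuple whose first item is a
    connective name, e.g. ("->", A, B), ("and", A, B), ("not", A).
    """
    if isinstance(formula, str):
        return {formula}
    _, *parts = formula
    found = set()
    for part in parts:
        found |= variables(part)
    return found

def shares_variable(implication):
    """True iff antecedent and consequent of A -> B share a variable."""
    if isinstance(implication, str) or implication[0] != "->":
        raise ValueError("expected a formula of the form ('->', A, B)")
    _, antecedent, consequent = implication
    return bool(variables(antecedent) & variables(consequent))

# XP-: (p and not-p) -> q shares no variable, so it fails the test.
xp_minus = ("->", ("and", "p", ("not", "p")), "q")
# p -> (q -> p) passes the test even though it is a paradox of implication.
positive_paradox = ("->", "p", ("->", "q", "p"))
```

That the second formula passes shows in miniature why the test can be at most 
a necessary condition on relevant logics. 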

A form of the variable sharing constraint was discovered early in the development 
of modern logic. Russell's book, The Principles of Mathematics, begins in its first 
chapter with a version of variable sharing (1903, p. 3): 


Pure mathematics is the class of all propositions of the form "p implies q," where p and 
q are propositions containing one or more variables, the same in the two propositions, 
and neither p nor q contains any constants except logical constants. 


It is not remarkable that Russell demands that all statements of mathematics be 
implications, since it is well known that Russell (following Peano) believed that 
statements of mathematics are formal implications. What is interesting, however, is 
that Russell demands that the two propositions in an implication of mathematics 
contain exactly the same variables. The variables discussed here are not usually 
propositional variables, as they are in the relevant logicians' variable sharing 
constraint, but Russell does seem to desire that the formal implications of mathematics 
connect propositions that have content in common. These propositions are 
supposed to be about the same things (Russell, 1903, section 5). 

The variable sharing constraint by itself, however, does not yield an analysis of 
relevance. Although it is a necessary condition for a logic to be a relevant logic, it is 
not sufficient. For suppose that one merely accepted all of the theorems of classical 
logic that satisfied this constraint. One would then still be left with 


ea) 
paar 


as well as many other paradoxes. 

13.3. The Deduction Theorem 


Relevant logics are supposed to capture the relation of entailment or that of 
implication between propositions. The philosophical notion of entailment was first 
developed by G. E. Moore in approximately 1920. He defines 'p entails q' to mean 
'q is deducible from p' (Moore, 1922, p. 291). Thus, where '⊢' represents the 
relation of deducibility, the following captures the logics of entailment: 


If A ⊢ B, then it is a theorem of the logic that A entails B. 


Since implication is logically weaker than entailment, this relationship should also 
hold for implication. The above condition is known as the single premise deduction 
theorem. 

To understand the importance of the deduction theorem in this context, a little 
more needs to be said about the notion of relevant deducibility. One standard 
condition on deducibility relations is that, if a proposition is a premise then it can 
also be a conclusion. For example, classical logicians and intuitionists take 


(PP)   p, q ⊢ p 


to be a valid deduction. But relevant logicians do not. For, consider the full deduction 
theorem: 


If A1, ..., An, A ⊢ B then A1, ..., An ⊢ A → B 





If one were to accept (PP), then, by two applications of the deduction theorem, one 
would have to accept 


⊢ p → (q → p) 


This says that p → (q → p) is a theorem. But it is a paradox of implication, and it is 
not wanted. So one cannot accept this standard condition on deducibility. 
Instead, relevant logicians have developed a notion of deduction due to Moh 
Shaw-Kwei (1950) and Church (1951). On this conception of deducibility, 
A1, ..., An ⊢ B is relevantly valid only if A1, ..., An may all be really used in the 
deduction of B. In (PP), q is not used in the deduction of p, hence relevant logicians 
claim that (PP) is not a valid inference. 

The requirement that it is possible to use all premises in a relevant deduction needs 
itself to be fleshed out. One means, in natural deduction systems, is the method of 
'relevance indices.' Such systems are not dealt with here, but the main point can be 
put briefly. Hypotheses in a proof are tagged, and other steps are indexed by the 
tags on the hypotheses that are used to produce them. For a tagged hypothesis A to 
be discharged in a conditional sub-proof, the conclusion C of that sub-proof must 
bear (perhaps among others) the tag on the hypothesis A; if it does, A → C is 
inferred by discharging A, ending the sub-proof. After this application of the → I 
rule, the tag on the discharged hypothesis is removed from the indices on A → C, 
which inherits its remaining tags, if any, from the conclusion C of the conditional 
proof. If a tag on a hypothesis does not appear in the steps on which the 
conditional proof is based, then the hypothesis cannot be discharged in that proof. 
Thus only hypotheses that are really used can be discharged. 
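
The bookkeeping just described can be sketched in a few lines of Python. 
This is an illustration under our own assumptions, not a rendering of any 
published system: the names Step, hypothesis, arrow_elim and arrow_intro are 
hypothetical, formulas are encoded as strings or ("->", A, B) tuples, and 
only the two rules for → are modelled. 

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Step:
    formula: object        # a variable name or a tuple ("->", A, B)
    indices: frozenset     # tags of the hypotheses actually used

def hypothesis(formula, tag):
    """A hypothesis carries exactly its own tag."""
    return Step(formula, frozenset({tag}))

def arrow_elim(major, minor):
    """From A -> B and A infer B; the indices are pooled."""
    f = major.formula
    if not (isinstance(f, tuple) and f[0] == "->" and f[1] == minor.formula):
        raise ValueError("premises do not fit modus ponens")
    return Step(f[2], major.indices | minor.indices)

def arrow_intro(hyp, conclusion):
    """Discharge hyp to get hyp -> conclusion, if hyp was really used."""
    (tag,) = hyp.indices
    if tag not in conclusion.indices:
        raise ValueError("hypothesis was not used; it cannot be discharged")
    return Step(("->", hyp.formula, conclusion.formula),
                conclusion.indices - {tag})

# A relevantly acceptable derivation of (p -> q) -> (p -> q).
h1 = hypothesis(("->", "p", "q"), 1)
h2 = hypothesis("p", 2)
q_step = arrow_elim(h1, h2)            # q, indexed by {1, 2}
inner = arrow_intro(h2, q_step)        # p -> q, indexed by {1}
theorem = arrow_intro(h1, inner)       # (p -> q) -> (p -> q), no indices left
```

By contrast, with p tagged 1 and q tagged 2, the call 
arrow_intro(hypothesis("q", 2), hypothesis("p", 1)) raises an error, since 
tag 2 is not among the indices on p. This is the sketch's counterpart of the 
refusal to derive q → p, and hence p → (q → p), from (PP). 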

In sequent systems for relevant logics, the real use requirement is enforced by 
treating premises and their relation to conclusions in a very intensional manner. A 
sequent (or consecution) is a structure of the form A1, ..., An ⊢ B. And a sequent 
calculus (also known as a 'consecution calculus' or 'Gentzen system') is a logic for 
inferring sequents from sequents. Consider the classically and intuitionistically valid 
inference on sequents from (13.1) to (13.2): 


B ⊢ C        (13.1) 
A, B ⊢ C     (13.2) 


This inference, the so-called 'weakening' rule, obviously allows one to add 
arbitrary premises, even those that cannot be used in any intuitive sense. True, there 
is a way to concede that classical or intuitionist logicians who appeal to the 
weakening rule know what they are talking about. For they interpret the structural ',' 
as extensional conjunction '∧'. Following Dunn (1975b [1973]) and Mints 
(1976), another structural connective is introduced, ':', to do this job. So the 
standard logician does have a good argument to justify weakening. But it applies 
to ':', and not to ','. So from (13.1), one can justifiably conclude, not (13.2), 
but 


A : B ⊢ C    (13.3) 

On the Dunn-Mints plan, this leads immediately to 

A ∧ B ⊢ C    (13.4) 


Relevant logicians deny that (13.2) and (13.3) are equivalent. That is, they deny 
that the premises in a relevantly valid argument are conjoined to one another 
with standard conjunction. Rather, they think an inference A1, ..., An ⊢ B is 
equivalent to ⊢ A1 → (... (An → B) ...). Moreover, the latter is not equivalent to 
⊢ (A1 ∧ ... ∧ An) → B in relevant logics. Instead of taking premises in an inference 
as bound together by standard extensional conjunction (i.e. whose truth conditions 
are determined by a truth-table), relevant logicians have introduced another form 
of conjunction. This is the intensional conjunction that was briefly introduced in 
section 13.1. It is called fusion, written '∘', and goes back (at least) to Church 
(1951). The central requirement of fusion is that it obey residuation: 


A ∘ B ⊢ C iff A ⊢ B → C 


In short, fusion needs to satisfy the deduction theorem. 

To sum up what has been said so far, relevant logics should satisfy three conditions: 


1. They should avoid the paradoxes of implication and, in particular, give a way of 
   dealing with contradictions and other impossibilities non-trivially. 

2. They should satisfy the variable sharing constraint. 

3. They should contain a deducibility relation that requires all premises in a valid 
   deduction to be capable of being used in that deduction and they should satisfy 
   a deduction theorem. 


13.4. The Ubiquity of Inconsistency 


Relevant logics avoid XP-. This makes them paraconsistent logics. A paraconsistent 
logic is a logic that somehow tolerates contradictions. This is a very good feature, 
for contradictions are everywhere. Sad though this fact may be for any pursuit of 
rationality, candor compels its admission. Here are some of the spots at which 
inconsistencies break through: 


• Natural science  A theory is said to be 'in difficulties' when it conflicts with the 
  results of observation or with another well-accepted theory. Combined with 
  classical physics, Bohr's early theory of the atom predicts that electrons would 
  radiate energy and fall into nuclei, and that they would not. 

• Foundations of mathematics  From infinitesimal analysis through the summation 
  of infinite series to the contradictions of set theory, mathematics too has ever 
  been 'in difficulties.' 

• Bad data  A recent census reported one million more married women than 
  married men in the USA. This is unlikely. 

• Metaphysics  Is not Zeno's arrow always both at rest and in motion? 

• Theology  God is three. God is one. Is He off by two? 


This is not to suggest that the contradictions in all (or even any) of these cases are 
ineluctable. Great efforts have been made to resolve (or at worst live with) the 
associated problems. But no one takes XP- above at face value, to deduce whatever 
they want (and don't want) from present mistakes (if mistakes they be). 

Rather, as Belnap has urged, people reason around any inconsistency in their 
present beliefs. This section now tries to find some theoretical ground on which to 
do so. Building on algebraic and semantical work by Bialnicki-Birula and Rasiowa 
(1975), Dunn (1975a [1966]) and others, Routley and Routley (1972) introduced 
a unary operation on what are called 'worlds' or, more soberly, 'theories', 'set-ups' 
or 'situations'. Where a is a world, Routley and Routley postulate a companion 
world a* such that 


T~   I(~A, a) = true iff I(A, a*) = false 


where I is an interpretation in a model. Two ways to understand the 'Routley star 
operator' are discussed in section 13.6 below. Given this formalism, however, it is 
immediately clear how Routley and Routley (renamed Sylvan and Plumwood) provide 
a semantic refutation of XP-. Here (p ∧ ~p) → q is refuted. Fixing a, set I(p, a) = true 
and I(p, a*) = I(q, a) = false. Applying T~, this makes the premise (of XP-) true at a 
but its conclusion false at a. And a purported implication that fails to preserve truth 
is surely no just candidate to ground a theory of logical consequence. 
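
The refutation can be replayed on a two-world structure. The following 
Python sketch is illustrative only. It assumes, beyond what the text states, 
that a** = a and that q is false at a*; the names star, atoms and holds are 
ours. Only the clauses for ~ and ∧ are implemented. 

```python
# Two worlds: a and its companion a*. Assumption: the star of a* is a again.
star = {"a": "a*", "a*": "a"}

# The assignment from the text; the value of q at a* is our assumption.
atoms = {
    ("p", "a"): True,
    ("p", "a*"): False,
    ("q", "a"): False,
    ("q", "a*"): False,
}

def holds(formula, world):
    """Evaluate a formula built from variables, ("not", A) and ("and", A, B)."""
    if isinstance(formula, str):
        return atoms[(formula, world)]
    op = formula[0]
    if op == "not":
        # T~: ~A is true at a world iff A is false at its star world.
        return not holds(formula[1], star[world])
    if op == "and":
        return holds(formula[1], world) and holds(formula[2], world)
    raise ValueError(f"unsupported connective: {op!r}")

premise_at_a = holds(("and", "p", ("not", "p")), "a")
conclusion_at_a = holds("q", "a")
```

Here premise_at_a is true and conclusion_at_a is false: p holds at a, ~p 
holds at a because p fails at a*, and q fails at a. 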





13.5. Model Theoretic Semantics 


Section 13.4 introduced worlds and the Routley star operator. Now it is time to 
present a semantics for relevant logic in almost all its glory. Some of its finer 
technical details are omitted, and discussion is limited to its philosophically more 
interesting aspects. 

Like Kripke's semantics for modal logics [see chapter 7], the semantics for 
relevant logic is a world-based semantics. Our frames start with a set of worlds, K. Of 
these worlds, some are distinguished, and called N ('normal worlds'). A formula is 
valid in a frame iff it is true on all normal worlds on all interpretations. As in Kripke's 
semantics, there is also an accessibility relation, R, between worlds. His accessibility 
relation, however, is meant to deal with a unary operator, necessity. Instead, R deals 
with a binary connective, implication. According to Jonsson and Tarski (1951), it 
makes formal sense to treat a unary connective by means of a binary relation and to 
treat a binary connective using a ternary relation. 

The truth condition for relevant implication is 


T→   I(A → B, a) = true iff for all b, c such that Rabc, if I(A, b) = true, then 
     I(B, c) = true. 


Notice how this truth condition allows one to avoid, e.g., the paradoxical A → (B → B), 
since it does not force B → B to hold at all worlds. 

To handle negation, there is the Routley star operator, introduced above. The 
truth conditions for extensional conjunction and disjunction are straightforward: 


T∧   I(A ∧ B, a) = true iff I(A, a) = true and I(B, a) = true. 
T∨   I(A ∨ B, a) = true iff I(A, a) = true or I(B, a) = true. 





Intensional conjunction, or fusion, discussed above, has a more technical condition: 


T∘   I(A ∘ B, a) = true iff there are some worlds b, c such that Rbca and 
     I(A, b) = true and I(B, c) = true. 


This condition looks a bit forbidding, but it can be understood if one follows Lewis 
in thinking of fusion as a type of binary relative possibility connective. I(A ∘ B, a) = 
true says that A and B are jointly possible at a in the sense that a recognizes the 
combination of worlds in which A and B obtain. 
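
The clauses for →, ∧, ∨ and ∘ translate directly into a recursive evaluator. 
The sketch below is an illustration under stated assumptions: the function 
name holds, the tuple encoding of formulas, and the toy structure (worlds n, 
b, c, d, e with two R-triples) are ours, and the structure is not checked 
against any frame postulates such as hereditariness. It only shows how the 
clauses compute. Negation is left out. 

```python
def holds(formula, world, R, val):
    """Truth at a world under the ternary-relation clauses.

    R is a set of triples (a, b, c) read as Rabc; val maps each variable
    to the set of worlds at which it is true.
    """
    if isinstance(formula, str):
        return world in val.get(formula, set())
    op, left, right = formula
    if op == "and":
        return holds(left, world, R, val) and holds(right, world, R, val)
    if op == "or":
        return holds(left, world, R, val) or holds(right, world, R, val)
    if op == "->":
        # For all b, c with R(world, b, c): left at b forces right at c.
        return all(holds(right, c, R, val)
                   for (a, b, c) in R
                   if a == world and holds(left, b, R, val))
    if op == "fuse":
        # Some b, c with R(b, c, world): left at b and right at c.
        return any(holds(left, b, R, val) and holds(right, c, R, val)
                   for (b, c, a) in R
                   if a == world)
    raise ValueError(f"unsupported connective: {op!r}")

# Toy structure (hypothetical, chosen only to exercise the clauses).
R = {("n", "b", "c"), ("c", "d", "e")}
val = {"p": {"b"}, "q": {"d"}, "r": {"n"}}

q_to_q_at_c = holds(("->", "q", "q"), "c", R, val)
paradox_at_n = holds(("->", "p", ("->", "q", "q")), "n", R, val)
fusion_at_c = holds(("fuse", "r", "p"), "c", R, val)
```

In this structure q → q fails at c, because Rcde holds with q true at d and 
false at e; so p → (q → q) fails at n, where Rnbc holds and p is true at b. 
The fusion r ∘ p is true at c, witnessed by the same triple Rnbc. 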

The relationship between non-normal worlds and normal worlds in the 
Routley-Meyer (henceforth relational) semantics is also interesting and important. 
Define a relation ≤ on worlds such that 


a ≤ b iff there is some world n in N such that Rnab 


and postulate that ≤ is reflexive and transitive. Also, place certain constraints on 
frames and on interpretations so that hereditariness holds, namely, 


If I(A, a) = true and a ≤ b, then I(A, b) = true. 


It is traditional in logic to identify an entailment relation on sentences with the truth 
of → statements at one or more points. This tradition is reflected in the relational 
semantics via the semantical entailment fact below. Given an interpretation I, it is 
said that 


A entails B iff, for all worlds a, if I(A, a) = true then I(B, a) = true. 


Entailment thus being (as usual) truth-preservation over all worlds, the matching 
true → statements are those true at all normal worlds n in N. That is, given I, 


A normally implies B iff, for all n in N, I(A → B, n) = true. 


And now by hereditariness and reflexivity of ≤, this important semantical entailment 
fact obtains, for every frame and every interpretation I therein: 


Fact (SemEnt): A entails B iff A normally implies B. 


The proof is simple, and so is left to the reader. 

This fact simplifies soundness proofs for relevant logics. Suppose that one wants 
to verify a theorem of the form 'A → B'. Assuming that, at an arbitrary world a, 
I(A, a) = true, one then shows I(B, a) = true. Applying SemEnt, A → B is true on every 
normal world in the model. Generalizing, A → B is true on every normal world in 
every model structure, hence it is valid. The reader should keep this use of SemEnt 
in mind when the various semantical postulates that are placed on models in coming 
sections are discussed. This clarifies greatly the relationship between the postulates 
and their corresponding axioms. 





13.6. Interpreting the Semantics 


How is one to understand the various features of the semantics? We do not think 
that there is a single right answer to this question. Different relevant logics, we 
think, formalize different notions of entailment or implication. For these, different 
interpretations of their corresponding semantics are appropriate. 

Consider the relation R. One interpretation, due to Barwise (1993) and developed 
by Restall (1996), takes worlds to be 'sites and channels.' A channel transmits 
information from site to site. In addition, channels can also be sites and sites can be 
channels. Where a, b and c are sites, they read Rabc as saying that a is a channel 
between b and c, and thus I(B → C, a) = true as saying that all pairs of sites b, c 
connected by channel a are such that if B is information available in b, then C is 
information available in c. For example, for two sites connected by a telephone wire 
(the channel), what one person says in one site causes a person in the second site to 
hear certain sounds. 

Mares (1996) presents another interpretation that adapts Israel and Perry's (1990) 
theory of information to the relational semantics. On this, the worlds in frames are 
situations, in the sense of Barwise and Perry's situation semantics (1983). [See 
chapter 20.] Situations contain information. A piece of information (an infon, to 
use Devlin's (1991) term) might be about the physical things in the situation, or it 
might be about connections between other infons. In particular, an infon might be 
about what information other infons carry. For example, an infon might carry the 
information that a red light showing on a stove carries the information that the oven 
is on. These infons that present information about connections between other infons 
can be called informational links. The accessibility relation R represents the links in 
situations. If there is a link in a situation a that says that an infon σ carries the 
information that the infon τ also holds, then if Rabc and b contains the infon σ, 
then c contains the infon τ. Links are not only included among the information in 
a situation, but also impose closure constraints on the set of infons in the situation. 
For example, if the 'law of nature' 


All bodies attract one another 


is a link in a and i and j are bodies in a, then i and j attract one another in a. Thus, 
on the link interpretation, add the following postulate to the definition of a frame: 


Raaa 


which says that every situation is closed under informational links. And note that the 
link interpretation demands this closure. On the channel theoretic interpretation, on 
the other hand, this is an unnatural postulate, for not every channel carries informa- 
tion from itself to itself. 

On to the other aspects of the semantics: As has been seen, the relational semantics 
divides worlds into normal and non-normal worlds. For some logics, the normal 
worlds (those at which we verify theorems) can be interpreted as possible worlds in 
the metaphysicians' sense. For these logics, the normal worlds can be taken to be 
complete (i.e. to satisfy the principle of excluded middle) and to be consistent. But 
not all relevant logics are characterized by a class of frames that have these 
properties. Only those logics which have excluded middle as a theorem and for which 
the rule gamma (γ) is admissible (see section 13.9) have a model theory of this sort. 

The star operator has been controversial, but it can be given various reasonable 
interpretations. First, start with a linguistic interpretation from Meyer and Martin 
(1986). The underlying idea here is that there is a distinction between what one 
actually asserts and what one fails to deny (thus what one weakly asserts). Here are a 
couple of sentences of recent interest: 


P: In December 1999, NASA listened in vain for signals from the Mars polar 
   lander. 


Q: Martians interfered in 1999 with the transmission from their polar region. 


All of us, probably, will agree with P. But what of Q? Only supermarket tabloids are 
likely actually to assert it, perhaps citing P as evidence. But one might, if one pleases, 
weakly assert Q, lacking evidence to support its denial. On this interpretation, a* 
comprises the sentences weakly asserted at a. 

Another interpretation, due to Dunn (1993), suggests one thinks of two worlds 
as containing compatible or incompatible information. Suppose that a says that a 
particular table is round and according to b that table is square. Then a and b can be 
said to be incompatible with one another. On the other hand, if there are no such 
conflicts, then the two worlds are compatible. In the language of our formal semantics, 
Cab says that a and b are compatible. The truth condition for negation can be 
explicated in terms of compatibility alone, namely, 


C~   I(~A, a) = true iff for all b such that Cab, I(A, b) = false. 


In other words, ~A is true at a if A's being true is somehow incompatible with the 
other information contained in a. Worlds can be incompatible with themselves; any 
inconsistent world is. The star operator can now be understood in terms of 
compatibility, for a* can be taken to be the maximal world such that a is compatible 
with it. That is, for any world a, a* is the world such that 


(i)  Caa* and 
(ii) for any world b, if Cab, then b ≤ a*. 


(Of course, this definition assumes that, for any world, there always is a maximal 
world compatible with it.) 


13.7. Some Main Systems of Relevant Logic 





This section presents some of the central systems of relevant logic. It does not 
present all the systems that people have proposed or on which important work has 
been done. Rather, it looks at only enough to give the reader the flavor of the 
systems and an idea of the range of relevant logics. 

From the outset here, a relevant → has been considered essentially relational. 
Before going into the details of specific systems, one might pause to think about 
relations. The most famous ones are binary (2-place), e.g., brother, sister, parent, 
child, not to mention =, ∈, <. Even modal logics have a philosophically motivated 
(Kripke) binary relational semantics [see chapter 7]. Yet the key to the universe, 
discussed below, involves a step up, at least to ternary (3-place) relations. (There are 
more than a few of these as well, e.g., between, sum, product, jealous.) Consider now 
relational composition. Use 'B', 'P', 'U' respectively for brother, parent, and uncle. 
Then x is the uncle of y iff ∃z(Bxz ∧ Pzy). The notation for this will set 


Uxy =df P(Bx)y 

which one can abbreviate, on the obvious convention, to 

Uxy =df PBxy 


Thus, the result of composing two binary relations is another binary relation. Things 
become more interesting when composing general n-ary relations. For one thing, 
composing two 3-place relations yields a 4-place one (and so on, pushing n as 
high as one likes). For another thing, the order in which 3-place relations are 
composed definitely matters. For it is important to distinguish ∃x(Rabx ∧ Rxcd) from 
∃x(Raxd ∧ Rbcx). This can be done by extending the conventions just introduced in 
the 2-place case and writing 


Rabcd   =df R(Rab)cd =df ∃x(Rabx ∧ Rxcd) 
Ra(bc)d =df Ra(Rbc)d =df ∃x(Raxd ∧ Rbcx) 


(thus employing again the device of associating the composed relations to the left, 
inserting explicit parentheses otherwise. The iterated occurrences of R can also be 
dropped, which 


(a) reduces visual clutter and 
(b) clarifies connections between candidate logical axioms and matching combinators, 
    the key to the universe, discussed below.) 
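
The two ways of composing a ternary relation with itself are easy to confuse 
on paper, so a small computational check may help. The sketch below is 
illustrative: the function names compose2, left_assoc and right_assoc, and 
the sample data, are our own assumptions. 

```python
def compose2(first, second):
    """Binary composition: pairs (x, y) with some z such that
    (x, z) is in first and (z, y) is in second."""
    return {(x, y) for (x, z) in first for (z2, y) in second if z == z2}

def left_assoc(R):
    """Rabcd: there is an x with Rabx and Rxcd."""
    return {(a, b, c, d)
            for (a, b, x) in R for (x2, c, d) in R if x == x2}

def right_assoc(R):
    """Ra(bc)d: there is an x with Raxd and Rbcx."""
    return {(a, b, c, d)
            for (a, x, d) in R for (b, c, x2) in R if x == x2}

# Hypothetical family data: al is a brother of bo, bo is a parent of cy.
brother = {("al", "bo")}
parent = {("bo", "cy")}
uncle = compose2(brother, parent)

# A sample ternary relation on which the two compositions come apart.
R = {(1, 2, 3), (3, 4, 5)}
left_result = left_assoc(R)
right_result = right_assoc(R)
```

On this data uncle contains just the pair (al, cy), left_result contains just 
(1, 2, 4, 5), and right_result is empty, so the order of composition does 
matter. 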


Now for some systems: Start with the logic B. This system (or at least its positive 
part B+) may be taken as a base system in much the same sense as the logic K is 
taken to be the base normal modal logic. The language used, and that has been 
assumed throughout this chapter, includes the unary connective ~ (for negation), 
the binary connectives ∧ (extensional conjunction) and → (implication or entailment). 
Extensional disjunction, ∨, is another primitive for B+; otherwise it is defined, along 
with ↔, as usual: 


A ∨ B =df ~(~A ∧ ~B) 
A ↔ B =df (A → B) ∧ (B → A) 


The axiom schemes and rules for B are as follows:

    1  A → A
    2  (A ∧ B) → A
    3  (A ∧ B) → B
    4  ((A → B) ∧ (A → C)) → (A → (B ∧ C))
    5  ((A → C) ∧ (B → C)) → ((A ∨ B) → C)
    6  A → (A ∨ B)
    7  B → (A ∨ B)
    8  (A ∧ (B ∨ C)) → ((A ∧ B) ∨ (A ∧ C))
    9  ~~A → A

    Modus Ponens     from ⊢ A → B and ⊢ A, infer ⊢ B
    Adjunction       from ⊢ A and ⊢ B, infer ⊢ A ∧ B
    Affixing         from ⊢ A → B and ⊢ C → D, infer ⊢ (B → C) → (A → D)
    Contraposition   from ⊢ A → ~B, infer ⊢ B → ~A


To form the positive fragment B+ of B, chop axiom 9 and the contraposition rule.
To these systems, relevant logicians add various axiom schemes. For example, the
logic R results from adding to B the schemes:


    10  (A → B) → ((B → C) → (A → C))     (Suffixing)
    11  (A → (A → B)) → (A → B)           (Contraction)
    12  (A → (B → C)) → (B → (A → C))     (Permutation)
    13  (A → ~B) → (B → ~A)               (Contraposition)
    14  (A → ~A) → ~A                     (Reductio)


With the addition of these schemes, there is no need for the affixing and contraposition
rules; they can be derived. (Adding these axiom schemes to B+ yields the positive
fragment R+ of R, and similarly for the other logics mentioned below.)

E results from adding to B suffixing, contraction, contraposition, reductio and
another axiom, e.g.,


    15  (((A → A) ∧ (B → B)) → C) → C


E was supposed to formalize the notion of entailment, 'the converse of deducibility.'
Entailment was motivated as both relevant and necessary. R was supposed to be the
'demodalized,' but nonetheless relevant version of E. The thought was that R has
approximately the same relationship to E that classical propositional logic has to S4.
A natural hope was that by adding an S4-like necessity, R could be extended to a
system NR that would prove equivalent to E, parsing the entailment of E as strict
relevant implication (Routley and Meyer, 1982b [1972]). Although Kripke and
others confirmed this for some fragments of E, the project unfortunately collapsed
when Maksimova (1973) exhibited a non-theorem of E which is nonetheless provable
on NR translation.

The system T of 'ticket entailment' results from adding to B suffixing, contraction,
and prefixing, which is


    16  (B → C) → ((A → B) → (A → C))


as well as contraposition (axiom 13) and reductio (axiom 14). Here an arrow
formula A → B is taken as an inference ticket, saying that the inference from A to
B is justified (Anderson and Belnap, 1975, section 6).

Using the abbreviations introduced above, table 13.1 presents some
correspondences between propositional theses and semantic theses in the relational semantics.
Also listed are the names of associated combinators, whose significance becomes clear
in section 13.8.











Table 13.1

Combinator   Thesis name        Thesis                               Semantic postulate
B            Prefixing          (B → C) → ((A → B) → (A → C))        Rabcd ⇒ Ra(bc)d
B′ (= CB)    Suffixing          (A → B) → ((B → C) → (A → C))        Rabcd ⇒ Rb(ac)d
W            Contraction        (A → (A → B)) → (A → B)              Rabc ⇒ Rabbc
C            Permutation        (A → (B → C)) → (B → (A → C))        Rabcd ⇒ Racbd
CI           Assertion          A → ((A → B) → B)                    Rabc ⇒ Rbac
K            Weakening 1        A → (B → A)                          Raba
KI           Weakening 2        A → (B → B)                          Rabb
             Double negation    ~~A → A and A → ~~A                  a** = a
             Contraposition     (A → ~B) → (B → ~A)                  Rabc ⇒ Rac*b*
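
Some of the semantic postulates in the table can be tested mechanically on a finite relation. The sketch below is only an illustration under simplifying assumptions: it treats R as a bare set of triples (no ordering on worlds, no star operation, no designated base world, all of which a full relational frame would have), and the names `r2`, `pasch`, `contraction`, `assertion`, `R_max`, and `R_proj` are hypothetical.

```python
from itertools import product

def r2(R, worlds, a, b, c, d):
    """Rabcd: there is an x with Rabx and Rxcd."""
    return any((a, b, x) in R and (x, c, d) in R for x in worlds)

def pasch(R, worlds):
    """Permutation postulate: Rabcd implies Racbd."""
    return all(r2(R, worlds, a, c, b, d)
               for a, b, c, d in product(worlds, repeat=4)
               if r2(R, worlds, a, b, c, d))

def contraction(R, worlds):
    """Contraction postulate: Rabc implies Rabbc."""
    return all(r2(R, worlds, a, b, b, c) for (a, b, c) in R)

def assertion(R, worlds):
    """Assertion postulate: Rabc implies Rbac."""
    return all((b, a, c) in R for (a, b, c) in R)

worlds = range(3)
# Hypothetical relation 1: Rabc iff max(a, b) <= c (symmetric in a and b).
R_max = {(a, b, c) for a, b, c in product(worlds, repeat=3) if max(a, b) <= c}
# Hypothetical relation 2: Rabc iff c == b (not symmetric in a and b).
R_proj = {(a, b, c) for a, b, c in product(worlds, repeat=3) if c == b}

for name, R in [("R_max", R_max), ("R_proj", R_proj)]:
    print(name, pasch(R, worlds), contraction(R, worlds), assertion(R, worlds))
```

The first relation satisfies all three postulates; the second satisfies contraction but fails both permutation and assertion, so the postulates are independent constraints on R in at least this direction.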





13.8. Combinators: Connecting Proof Theory to Semantics 


This section presents the mathematical motivations behind the various semantical
postulates. Clearly, the immediate mathematical motivation in each case is that the
postulate works: it does define a class of frames which characterizes the
corresponding logical system. But there is much more to it than this. There is an elegant
relationship between the conditions on relational frames and the branches of
mathematics called 'combinatory logic' (CL) and 'lambda calculus' (LC).
Combinatory logic was devised by Curry in approximately 1930 as a very general
way to represent and study operators and combinations of operators. For example,
from (Hindley and Seldin, 1986, p. 20), consider the arithmetic operation of
addition. Addition is commutative, i.e., x + y = y + x. Let the addition function be
represented by +. Then +(x, y) = +(y, x). Adopting the usual conventions of CL, all
functions may be treated as 1-place and parentheses and commas dropped while
associating to the left: +xy = +yx. Now introduce an operator C such that, for any
function f, Cfxy = fyx. Then it can be said of the addition operator that + = C+.
Combinatory logic studies operators, called combinators, that, like C, describe the
behavior of functions. It begins with a small stock of combinators, and defines other
combinators from them. For example, there is another combinator that describes
how functions are composed. This is the combinator B, and it obeys the equation:


    Bfgx = f(gx)


where f and g are functions. B says that the result of applying the composition of
two functions to an object is the same as the result of the application of the first
function to the result of applying the second function to the object.

Instead of using special notation for functions, combinatory logic attempts to be
perfectly general, not distinguishing notationally between functions and other
entities. To this end, 'Bxyz > x(yz)' will be written to represent the above reduction
rule.
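
The reduction rules can be tried out directly in any language with first-class functions. The following is a minimal sketch in Python, assuming the CL convention that every function takes exactly one argument (currying); besides B and C it includes I, K, W and S, which the chapter tabulates later in table 13.2. The helper names `add`, `sub`, `succ` and `double` are illustrative assumptions.

```python
# Curried combinators: each takes one argument and returns a function or value.
I = lambda x: x                                # Ix > x
K = lambda x: lambda y: x                      # Kxy > x
B = lambda x: lambda y: lambda z: x(y(z))      # Bxyz > x(yz)
C = lambda x: lambda y: lambda z: x(z)(y)      # Cxyz > xzy
W = lambda x: lambda y: x(y)(y)                # Wxy > xyy
S = lambda x: lambda y: lambda z: x(z)(y(z))   # Sxyz > xz(yz)

add = lambda x: lambda y: x + y    # curried addition, the '+' of the text
sub = lambda x: lambda y: x - y    # curried subtraction (not commutative)
succ = lambda n: n + 1
double = lambda n: 2 * n

print(C(add)(2)(5) == add(2)(5))   # + and C+ agree on these arguments
print(C(sub)(2)(5))                # same as sub(5)(2)
print(B(succ)(double)(10))         # same as succ(double(10))
print(W(add)(7))                   # same as add(7)(7)
print(S(K)(K)(42))                 # SKK behaves like I
```

Python evaluates arguments eagerly and is untyped, so this only illustrates the reduction rules on particular inputs; it says nothing about the type schemes discussed next.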

In addition to the combinators themselves, one needs to understand the type
schemes of combinators. A type scheme will be interpreted as a schematic formula.
In a 'Curry type,' the only pieces of logical notation are type variables f, etc., and
formulas A, etc., built out of these variables by implication → (and parentheses). For
example, the scheme A → B is the type of a combinator which takes entities of type
A to entities of type B. An easy combinator to type is the identity combinator, I,
whose reduction rule is Ix > x. Clearly, it takes entities of any type A and returns an
entity of the same type. So its type scheme is A → A. The type of K is also easy to
understand. Its reduction rule is Kxy > x. (Kx) applied to any entity returns x. So
it is a function from a thing of type A to a function from any entity to a thing of
type A. Consider K3, where 3 is the third positive integer. This is a constant
function, which returns the value 3 for any argument of any type. So K itself is of
the type A → (B → A). Table 13.2 lists some combinators with their reduction rules
and principal type schemes.

A logic can be thought of in terms of a set of combinators. The relevant logic R,
for example, can be thought of as B, C, I, W logic because it contains as axiom
schemes the type schemes of these combinators. In addition, R contains as theorems
all the type schemes for the combinations of these combinators. For instance, the
type scheme of CI is A → ((A → B) → B), which is a theorem of R.

Table 13.2

Combinator (Name)          Reduction rule      Type scheme
I (Identity)               Ix > x              A → A
B (Composition)            Bxyz > x(yz)        (B → C) → ((A → B) → (A → C))
B′                         B′xyz > y(xz)       (A → B) → ((B → C) → (A → C))
C                          Cxyz > xzy          (A → (B → C)) → (B → (A → C))
W (Diagonalization)        Wxy > xyy           (A → (A → B)) → (A → B)
K (Constant function)      Kxy > x             A → (B → A)
S (Strong composition)     Sxyz > xz(yz)       (A → (B → C)) → ((A → B) → (A → C))
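
As a worked illustration of how a type scheme is read off from reduction rules, consider the combination CI, whose type scheme is the assertion scheme. The sketch below uses only the rules for C and I; the judgment notation x : A is the usual Curry-style typing notation, assumed here rather than taken from the chapter.

```latex
\begin{align*}
\mathbf{C}\,\mathbf{I}\,x\,y \;&>\; \mathbf{I}\,y\,x
    && \text{rule for } \mathbf{C}:\ \mathbf{C}xyz > xzy \\
 &>\; y\,x
    && \text{rule for } \mathbf{I}:\ \mathbf{I}x > x
\end{align*}
% Suppose x : A and the result y x : B. Then y must take things of
% type A to things of type B, that is, y : A \to B.
% So CI takes x : A, then y : A \to B, and returns something of type B:
\[
\mathbf{C}\mathbf{I} \;:\; A \to ((A \to B) \to B)
\]
```

Read as a formula, the last line is the assertion scheme, which is how the combination CI comes to be paired with that thesis.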





To understand the relationship between combinators and the relational semantics,
the notion of a theory is needed. The language used is a fragment of propositional
language. For now, only formulas containing propositional variables, parentheses,
extensional conjunction and relevant implication are considered. Then a theory of a
logic L is defined to be a set of formulas X such that


(i) (adjunction) if A is in X and B is in X, then A ∧ B is in X and
(ii) (entailment) if A → B is in L then, if A is in X, then B is also in X.


Then a model may be created out of the set of theories of L. The ternary
accessibility relation on this set will be defined later. Now, introduce a fusion
operator on theories, ∘. (This is the same symbol as was used for intensional
conjunction in the syntax; that will be explained later.) Fusion is defined on theories as


    If X and Y are theories of L, then X ∘ Y =df {B : ∃A(A → B ∈ X & A ∈ Y)}


It can be shown that for any relevant logic that contains the implication-conjunction
fragment of B, the fusion of two theories is also a theory. The properties that
fusion has in a structure of this sort depend on the choice of relevant logic for L, but
here is one general property of fusion that will be needed later:


    Fusion Fact  If X ⊆ Y, then X ∘ W ⊆ Y ∘ W.


(The proof of this fact is easy and is left to the reader.)
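
The defining clause for fusion can be illustrated on finite sets of formulas. This is only a sketch: genuine theories are closed under adjunction and provable entailment and are therefore infinite, whereas the sets below are small hand-picked examples. The representation of an implication as a tagged tuple and the names `imp` and `fuse` are assumptions made for the illustration.

```python
def imp(a, b):
    """Build the formula a -> b as a tagged tuple."""
    return ("->", a, b)

def fuse(X, Y):
    """X o Y = {B : there is an A with (A -> B) in X and A in Y}."""
    return {f[2] for f in X
            if isinstance(f, tuple) and f[0] == "->" and f[1] in Y}

# Hypothetical finite sets of formulas (not closed theories).
X = {imp("p", "q")}
Y = X | {imp("r", "s")}      # X is a subset of Y
W = {"p", "r"}

print(fuse(X, W))            # formulas detachable from X using W
print(fuse(Y, W))            # formulas detachable from Y using W
print(fuse(X, W) <= fuse(Y, W))
```

The last line checks the inclusion stated in the Fusion Fact for this one example: enlarging the left-hand set can only add members to the fusion, since every implication available in X is still available in Y.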

One way of looking at the variations between the structure of theories of different
logics is by investigating the combinators under which they are closed. To explain:
take the combinator C. For ease of expression omit the fusion operator and merely
write 'xy' for the fusion of x and y; and associate to the left as above. Applied to
theories, the combinator equation for C says that Cxyz = xzy. Closure of a structure
of theories under C means that for any theories x, y and z of L, xyz = xzy.

Now for the fun part: Recall that the principal type scheme for C is the
permutation scheme (A → (B → C)) → (B → (A → C)). One can prove that the theory
structure for L is closed under C iff all instances of that type scheme are theorems of
L. This might seem to be an amazing coincidence, but it is not merely a
coincidence. It is, as promised, the key to the universe. The same correspondence holds for
all the combinators listed above.

There's more! Curry types (near enough, pure → formulas) were assigned to
combinators above. Types become more interesting if ∧ along with → are thrown
in. This fact was independently discovered by workers in LC in the late 1970s, led
by Coppo et al. (1980); but it is already reflected in the behavior of relevant theories
under fusion. Here's the scoop.

Think again about formulas of the form (A → (B → C)) → (B → (A → C)), which
according to us (and Curry) are mates of the combinator C. This is not, however, a
theorem scheme of the basic relevant logic B; but it does give rise to a theory,
namely, the set of all formulas which are provably entailed in the basic relevant logic
by conjunctions of permutation principles. It is this theory (call it C also) which
so wonderfully interacts with the fusion operation ∘ on theories. Other cases are
similar.

There are profound semantical and combinatorial facts underlying these
correspondences. Check again the postulates in table 13.1 on the relevant accessibility
relation R induced by various candidate axiom schemes. Note that, in the
abbreviated relational notation, the postulates look like the matching reduction rules for
corresponding combinators, as listed in table 13.2. Think yet again of C. Its
combinatorial reduction rule sets Cxyz = xzy. In section 13.7, in table 13.1, the
permutation axiom scheme was matched with the semantic postulate (often called 'Pasch')
Rxyzw ⇒ Rxzyw. Note that the second and the third arguments are reversed, just as
they are in the CL equality governing C. Combinator fans will note similar linkages
with the other suggested postulates and axioms, such as B′, W, CI.

Meyer and Routley (1972) were already in print with a key to the universe remark,
induced by the shape of relevant semantics and the corresponding algebras. They
knew even then of the formulas-as-(Curry)-types connections between combinators
and theorems of pure → intuitionist logic. But there were other candidate relevant
axioms that appeared as though they should fit into the scheme, but which did not
do so. Table 13.3 extends the two preceding tables (13.1 and 13.2) by
incorporating columns from both.

Table 13.3

Combinator (name)    Reduction rule    Semantic postulate    Type scheme
W* (Duplication)     W*x > xx          Raaa                  ((A → B) ∧ A) → B
WB                   WBxy > x(xy)      Rabc ⇒ Ra(ab)c        ((B → C) ∧ (A → B)) → (A → C)

Both of these candidate axiom schemes contain ∧ along with →. W* is also known
as WI, SII, or λx.xx. It has no Curry type. Yet the principles with which it has been
mated (conjunctive modus ponens deductively and total reflexivity of the 3-place
relation R semantically) are natural (and famous). WB does have a Curry type, but
it is the dull (A → A) → (A → A), not the more exciting conjunctive syllogism
here.

These correspondences point to a deep relationship between theories, fusion and
combinators. Return to our minimal positive logic B+; more particularly, to its
implication-conjunction part B∧, pronounced 'Band,' which is determined by the
→∧ axioms 1-4 of section 13.7, with the modus ponens, adjunction and affixing
rules governing these particles. For technical reasons (largely having to do with
modeling the bad combinator K), it is useful to extend B∧ to include also the
Church constant T, subject to the axiom schemes


    (T1)  A → T
    (T2)  T → (T → T)


The truth-condition on T is that it shall be marked true at every world, which is a
dull (and mainly silly) thing to do. The resulting system is called B∧T (say 'Bat').
Then, applying Barendregt et al. (1983) it can be seen that there is a model of LC
(and hence of CL) in the theories of B∧T. For the filters of the LC algebraists are
nothing but the theories of the relevant logicians. And the non-empty theories of
B∧T have all the right properties, along the lines explored above, to make true the
provable equalities of CL. Identify each combinator (B, C, W, K, S, W*, I, etc.)
with the set of all formulas of the corresponding scheme, closed in the appropriate
way to make it a theory of B∧T.

To be specific: Consider the combinator I, whose type scheme is A → A. Since a
theory is closed under the adjunction condition (i) given above, everything of the
form (A → A) ∧ (B → B) will also belong to the theory I. And since theories are
closed under the condition of provable entailment (ii) given above, I will contain yet
further members. Trivially, the top truth T is one such member; more interestingly,
formulas entailed by members of I, like any (A ∧ B) → A, also belong to I. In a
nutshell, as readers may verify, I will consist exactly of the theorems of our minimal
relevant logic. It follows, as night follows day, that, for every theory x, Ix = x.

Other combinators are similar, except that, since their corresponding schemes
are only sometimes available, only those logics for which the schemes are valid will be
closed under them. For illustrative purposes, consider the 'wicked' combinator K.
Its reduction rule is Kxy = x. While hopefully one has by now excluded its mate
A → (B → A) from one's own preferred logic, the displayed equality holds already
for all non-empty theories x, y at the B∧T level. For let A ∈ x. By definition of K,
B → A ∈ Kx for all B. Something B belongs to the non-empty theory y: at least T,
if nothing more salutary shows up. (And that is why T was added to B∧.)
Detaching, A ∈ Kxy, which establishes the inclusion from right to left. The converse
inclusion is also demonstrable. Using the ideas of Dezani, Motohama and their
colleagues, especially Proposition 9.6 of Dezani-Ciancaglini et al. (1998, p. 70),
all the other demonstrable equalities of the λβ-calculus (and hence of its definable
CL subsystem) are likewise modeled in the theories of B∧T.

That CL is the key to the (relevant semantical) universe means, so far, that


(a) there is a minimal relevant logic B∧ based on → and ∧
(b) the non-empty theories of this logic (with T) constitute a model for CL
(c) the fusion of theories models functional application in CL
(d) combinators are the theories determined by their 'types'
(e) all combinator laws hold as equalities in the calculus of theories.


Much has been made by CL and LC theorists about the interpretations of
formulas-as-types. Relevant logics turn this so-called Curry-Howard isomorphism on
its head [see chapter 1]. We interpret types-as-formulas. And our formulas really are
formulas: the formal sentences of some logical language. Aggregations of formulas
are bound into theories by conjunction and entailment. And it turns out, as Fine
(1992b [1974]) also emphasized, both that


(i) whole theories are the underlying ingredients of relevant semantical analysis,
    and
(ii) the shape of the semantics for any particular relevant logic will be determined
    by the combinators that correspond to its axioms.


Passing now to the analysis of further logical particles beyond → and ∧, and
stronger relevant logics than B and its kin, the disjunction ∨ poses immediate
problems. One prefers (and ought to prefer) prime theories x, which satisfy, for all A,
B, the primeness condition:


    Primeness Condition  If A ∨ B ∈ x then A ∈ x or B ∈ x.


This is wanted because it corresponds to the truth-condition on ∨, on which the
truth at x of a disjunction requires the truth at x of a disjunct. But prime theories are
not always easy to come by. Worse, even when each of the theories x, y is prime,
there is no guarantee that their fusion xy will be prime.

Nonetheless, there remains a strong relationship between combinators and the
relational semantics. For the canonical ternary accessibility relation is definable on
the structure of prime theories as


    Rxyz iff xy ⊆ z


And it turns out that, using the combinator facts about the calculus of theories
(for a given logic L), the necessary semantical postulates on the relation R almost
suggest themselves. So, think briefly (but only by example) about what makes the
relational semantics sound and complete. Suppose the familiar example, the C scheme
(A → (B → C)) → (B → (A → C)), is an axiom scheme of the logic L. Its mated
relational (Pasch) postulate is Rxyzw ⇒ Rxzyw, whose correspondence to the
combinator C has also been observed. To show that the postulate suffices to verify the
axiom (hence the closure of the theory structure of L under C) one simply applies
the SemEnt fact of section 13.5. (This is left for the reader to check.)
This illustrates the soundness of the ternary relational semantics for logics L. On
the side of completeness, the converse for the same case will be shown. Assuming that
all instances of the permutation scheme are theorems of L (equivalently, that the
structure of theories of L is closed under C) it will be shown that the Pasch
postulate holds for the canonical ternary relation R defined above on the structure
of prime theories of L.

In the first place, closure of all theories x of L under provable entailment still
means that Cx = x, in the presence of permutation as a theorem scheme. Hence,
xyz = Cxyz = xzy for all theories x, y, z of L. (The presence of ∨, or even ~, in the
vocabulary makes no difference to this situation.) But it needs to be shown that
permutation forces Pasch for the structure of prime theories of L. This, however,
follows from the squeezing lemma which holds for any of our logics L (Anderson
et al., 1992; Routley and Meyer, 1982a [1973]; Routley et al., 1982):


    Squeezing Lemma  Let x, y be theories, and let z′ be a prime theory of L.
    Suppose xy ⊆ z′. Then there exist prime theories x′, y′ of L, such that x′y ⊆ z′
    and xy′ ⊆ z′, where x ⊆ x′ and y ⊆ y′.


Then, to verify Pasch, given that the structure of theories of L is closed under C,
assume that there are prime theories x′, y′, z′, a′, w′ such that


    (i)  x′y′ ⊆ a′ and
    (ii) a′z′ ⊆ w′


To prove that there is a prime theory u′ such that x′z′ ⊆ u′ and u′y′ ⊆ w′:


    1  x′y′ ⊆ a′                 Hypothesis (i)
    2  x′y′z′ ⊆ a′z′ ⊆ w′        1, Fusion fact, Hypothesis (ii), Transitivity of ⊆
    3  x′z′y′ = x′y′z′ ⊆ w′      2, Closure under C
    4  Set u = x′z′              Definition (but u may not be prime)
    5  uy′ ⊆ w′                  3, 4


But then, by the Lemma, there exists a prime theory u′ of L, u ⊆ u′, such that


    6  u′y′ ⊆ w′                 5, Squeezing lemma
    7  x′z′ = u ⊆ u′             4


The conjunction of steps 7 and 6 verifies the conclusion of the Pasch postulate, on
the antecedent hypotheses (i) and (ii). So, given that L provides permutation, the
canonical ternary relation R on prime theories delivers Pasch, as promised. Other
cases are similar.

Thus there is a mathematically elegant relationship between combinatory logic
and relational semantics. This, however, is a point about the relational semantics in
general, rather than a feature of relevant logics. For one can give a ternary relational
semantics for non-relevant logics (Routley and Meyer, 1976). For example, if a
structure of theories is closed under K, this is not very relevant!


13.9. Relevant Results 


To understand a logic, one needs to understand a few of its mathematical properties.
This section presents, in a non-technical way, some technical results about relevant
logics.


13.9.1. The admissibility of gamma


When Anderson and Belnap (1975) reformulated Ackermann's logic Π′ (1956) as
their system E, they omitted Ackermann's third rule of inference, named γ:


    ⊢ ~A ∨ B
    ⊢ A
    ----------
    ⊢ B


Anderson and Belnap argue that a logic should not include a rule unless it includes
the corresponding theorem scheme. In this case, the corresponding theorem scheme is


    ((~A ∨ B) ∘ A) → B  or  ((~A ∨ B) ∧ A) → B


Adding the latter to E, by virtue of the Lewis argument given in section 13.1 above,
makes EFQ valid in E. And so adding it would remove E from the class of relevant
logics. Luckily, as Meyer has shown with Dunn (1975 [1969]) by algebraic means
and then on his own by a technique called 'metavaluations' (Meyer, 1975 [1976]),
the theorems of E are closed under γ. Thus, Anderson and Belnap's E has the same
theorems as Ackermann's logic.
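
One way to see, at the first-degree level, why the conjunctive form of the scheme is not relevantly acceptable is to evaluate it in the four-valued semantics that this chapter mentions only in a note. The sketch below is an illustration under stated assumptions: it uses the usual Belnap-Dunn encoding of the four values as pairs (told true, told false), it takes entailment to be preservation of 'told true', and it concerns only arrow-free A and B, not the full logic E. All names in the code are hypothetical.

```python
from itertools import product

# A value is a pair (told_true, told_false); the four combinations are
# the four values of the Belnap-Dunn semantics (an assumption of this sketch).
VALUES = list(product([True, False], repeat=2))

def neg(a):
    return (a[1], a[0])

def conj(a, b):
    return (a[0] and b[0], a[1] or b[1])

def disj(a, b):
    return (a[0] or b[0], a[1] and b[1])

def entails(premise, conclusion):
    """True iff every valuation making the premise told-true
    also makes the conclusion told-true."""
    return all(conclusion(a, b)[0]
               for a, b in product(VALUES, repeat=2)
               if premise(a, b)[0])

# From (~A v B) & A to B: the first-degree form of the gamma scheme.
gamma_premise = lambda a, b: conj(disj(neg(a), b), a)
proj_b = lambda a, b: b

# From A & B to A: axiom 2 of B, for comparison.
conj_premise = lambda a, b: conj(a, b)
proj_a = lambda a, b: a

print(entails(gamma_premise, proj_b))
print(entails(conj_premise, proj_a))
```

The counterexample the search finds gives A the value 'both' and B a value that is not told true. The rule γ is a different matter: it concerns closure of the set of theorems, which is why it can be admissible even though the scheme fails.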

The importance of the admissibility of γ goes far beyond proving the coincidence
of E and Π′. It shows that a logic is characterized by its class of 'normal' theories. A
normal theory is a theory that contains all the theorems of the logic, is consistent and
is prime.

Sometimes admitting γ also shows that a relevant logic contains the corresponding
logic based on the classical propositional calculus. Using γ, one can show that R and
E contain all of classical propositional logic (phrased in terms of negation,
conjunction and disjunction) and that various modal relevant logics contain all the theorems
of the corresponding classically based logics (Mares and Meyer, 1992).
Unfortunately, Meyer's relevant Peano arithmetic (the system R#) was shown by Friedman
and Meyer (1992) not to admit γ. In the process, it was also shown not to contain
all of the theorems of classical Peano arithmetic.
13.9.2. The undecidability of R and E


Urquhart (1992b [1984]) proved that the logics E, R and T are undecidable. This
result is important and the proof is very clever. Most philosophically motivated
propositional logics are decidable, for instance, classical propositional logic,
intuitionist logic and the standard normal modal logics. In fact E, R and T are the first
philosophically motivated propositional logics to have been proven undecidable.

The proof of undecidability is an extraordinary piece of work. Urquhart shows
that there is an interesting and important link between the relational semantics for
these relevant logics and projective spaces (of projective geometry). He then uses
the fact that the word problem for a particular class of infinite-dimensional projective
spaces is unsolvable to prove that the logics are undecidable.


13.9.3. The failure of interpolation in R and E
(and a host of other systems)


Another difficult and interesting proof due to Urquhart is his theorem that
interpolation fails in E and R as well as in T and a range of other logics.

What is interesting about interpolation from a relevant point of view is that some
relevant logics satisfy what Anderson and Belnap call the 'Perfect Interpolation
Theorem.' Consider Craig's interpolation theorem as stated for classical propositional
logic [see chapter 1, page 31]:


    Suppose that C is derivable from A. Then,
    (Cop-out) if A is not a contradiction and C is not a tautology
    there is some formula B such that


    (a) B contains only propositional variables that occur in both A and C;
    (b) B is derivable from A; and
    (c) C is derivable from B.


The Perfect Interpolation Theorem is the same as Craig's theorem with the omission
of the qualification Cop-out.
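
For the classical case, an interpolant can be computed by brute force over truth tables. The following is a minimal sketch, assuming formulas are represented as Python functions from assignments (dicts) to truth values; it builds the interpolant semantically, as a truth function of the shared variables, rather than as a written formula. The names `assignments`, `entails`, `interpolant` and the example formulas are illustrative assumptions.

```python
from itertools import product

def assignments(variables):
    """All truth-value assignments to the given variables, as dicts."""
    variables = sorted(variables)
    for values in product([False, True], repeat=len(variables)):
        yield dict(zip(variables, values))

def entails(f, f_vars, g, g_vars):
    """Classical consequence: every assignment satisfying f satisfies g."""
    return all(g(v) for v in assignments(f_vars | g_vars) if f(v))

def interpolant(A, a_vars, c_vars):
    """Project A onto the variables it shares with C.

    The returned function B is true on a shared assignment exactly when
    some way of filling in A's private variables makes A true.
    """
    shared = a_vars & c_vars
    private = a_vars - shared
    def B(v):
        fixed = {p: v[p] for p in shared}
        return any(A({**fixed, **extra}) for extra in assignments(private))
    return B, shared

# Hypothetical example: A = p & q, C = q v r; the shared variable is q.
A = lambda v: v["p"] and v["q"]
C = lambda v: v["q"] or v["r"]
a_vars, c_vars = {"p", "q"}, {"q", "r"}

B, shared = interpolant(A, a_vars, c_vars)
print(entails(A, a_vars, B, shared), entails(B, shared, C, c_vars))
```

In this example B behaves like the formula q. When A and C share no variables, the projection is a constant truth value, which no formula built from shared variables can express in a language without propositional constants; that is the situation the Cop-out clause sets aside.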

Some relevant logics do satisfy the Perfect Interpolation Theorem. For example,
McRobbie (1979) showed that the system OR (which is R without the distribution
axiom, axiom 8 of section 13.7) is perfectly interpolable. But Urquhart (1993)
again used the relationship between projective geometry and the relational semantics
to show that a range of relevant logics around and including E, R and T do not
interpolate.
13.9.4. Boolean conservative extension results


Meyer and Routley (1982 [1973]) show that one can add a second, Boolean,
negation to certain relevant logics without altering the stock of theorems that the
logic has in the old vocabulary. This is what is called a conservative extension result.
Boolean negation, ¬, is governed by some very unrelevant looking principles, such as


    (A ∧ ¬A) → B
    B → (¬A ∨ A)


This extension is interesting for a variety of reasons, mathematical and philosophical.
First, it allows one to use what is sometimes called 'denial negation' (a negation that
expresses the failure of something to be true) in relevant logic. Second, the
conservative extension result has enabled Belnap (1992b [1982]) to prove the correctness
of his elegant proof theory, Display Logic, for a range of relevant logics.

The conservative extension result holds for a wide range of logics, including B
and R. But it does not hold for E (Mares, 2000) nor for NR, mentioned above
(Meyer and Mares, 1993). (Recently Ross Brady has proved that quantified R is
conservatively extended by the addition of Boolean negation. As of the writing of
this chapter, he has not yet published this result.)


13.9.5. The consistency of relevant set theories 


Brady (1983) showed that a class theory with a naive comprehension axiom, based
on a weak relevant logic, is consistent. The comprehension axiom is


    ∃Y∀X(X ∈ Y ↔ A)


where X and Y are variables ranging over classes. This says that for each open
sentence A, there is a set that is its extension. This axiom was restricted in classical
set theory because it enabled the derivation of Russell's paradox [see chapter 3].
But, using his logic, DJᵈQ, as a base, Brady (1989) shows that a theory of classes
(and, in Brady (2001), a theory of sets) that incorporates naive comprehension is
consistent.
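
The classical derivation of Russell's paradox from this axiom takes only a few lines. The sketch below instantiates comprehension with the open sentence X ∉ X; the name R for the resulting class is chosen here for illustration and has nothing to do with the logic R.

```latex
\begin{align*}
&\exists Y\,\forall X\,(X \in Y \leftrightarrow X \notin X)
   && \text{comprehension with } A := X \notin X \\
&\forall X\,(X \in R \leftrightarrow X \notin X)
   && \text{naming a witness } R \\
&R \in R \leftrightarrow R \notin R
   && \text{instantiating } X := R
\end{align*}
```

In classical logic the last line is contradictory, and by the Lewis argument everything then follows. The consistency results cited here indicate that over a sufficiently weak relevant base logic the passage from that biconditional to triviality is blocked; the details depend on Brady's systems and are not reproduced in this sketch.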


13.9.6. The completeness and incompleteness of
quantified relevant logic


Fine (1992a [1989]) shows that the quantified relevant logic, RQ, is not complete
over its constant domain semantics. According to the constant domain semantics,
each world has the same stock of individuals and the truth condition for the universal
quantifier is the standard clause from modal logic [see chapter 7], namely,


    I_v(∀xA, a) = true iff I_u(A, a) = true, for every u, x-variant of v


Fine shows that there is a thesis valid over the class of constant domain models for
R that is not provable in the logic RQ. How to axiomatize a logic complete over
the constant domain semantics is still an open question.

Fine (1992c [1988]) developed a variable domain semantics for RQ. The semantics
is quite complicated and appeals to a special notion of arbitrary objects; but one
can make contact with more familiar terrain by linking Fine's ideas to central ones
from Kripke's model theory for intuitionistic logic [see chapter 11]. Altering Fine's
notation slightly, each world is linked to other worlds with larger domains of
individuals by a relation, Q. Thus, Qab only if D(a) ⊆ D(b). Fine's truth condition for
the quantifier may be put as


    I_v(∀xA, a) = true iff I_u(A, b) = true, for all b such that Qab and every u, x-variant of v


In effect, this says that a universally quantified sentence, ∀xA, is true at a world, a,
iff the open sentence A is true of everything in every world b larger than a.


13.9.7. The P-W problem


The logic P-W, as it is usually called (or T→-W as it should be called), has
implication as its sole connective. It contains as axiom schemes the type schemes for the
combinators B, B′ and I and is closed under modus ponens. Belnap had conjectured
that, for formulas A and B, the only cases in which both A → B and B → A are
provable in P-W are those in which A and B are the same formula. In the late
1960s, Larry Powers showed that this conjecture is equivalent to the conjecture that
no formula of the form A → A can be proved in the logic S, which is the closure
under modus ponens of the schemes B and B′ alone. Martin (1992 [1978]) and
Martin and Meyer (1982) proved Powers' S conjecture. So Martin solved the P-W
problem (which had been shaping up as the Fermat's Last Theorem of the area).
Martin's solution of the P-W problem is a wonderful example of technical
ingenuity and philosophical insight going hand in hand in the advancement of relevant
logic. For one now has a logic that does not make valid any form of circular
reasoning. It has been thought since the beginning of logic that deriving a
proposition from itself is not only useless but also fallacious. In S one has a logic that rejects
(root and branch) all forms of circular reasoning. Thus it serves as a test-bed for
ideas about circularity and how to avoid it.


13.10. But There’s So Much More to Say 


This chapter has provided the reader with a brief look at the motivation for and
some of the technical and philosophical aspects of relevant logic. But it has only
scratched the surface of this vibrant field of logic that has been the focus of fairly
intense mathematical scrutiny and philosophical debate in the past four decades.
It has not touched on the use of relevant logic in automated theorem proving
(Thistlewaite et al., 1988), or its relationship to linear logic or to computing more
generally. Nor has it discussed the sometimes heated debate over the status of
disjunctive syllogism (Read, 1988). And there is so much more. There are areas in
relevant logic that are just now beginning to be explored: the relationship between
these logics and logics of natural language conditionals, the use of relevant logic in
non-monotonic reasoning, among others. Hopefully, this chapter will inspire readers
to delve into these areas on their own, sadly without our guidance.


Suggested further reading


The best detailed introduction to relevant logic is Dunn (1984), which has recently been
updated (Dunn and Restall, 2001). For the philosophical debates surrounding relevant logic,
Routley et al. (1982) and Read (1988) are good places to start. A fine and very readable
introduction to substructural logics, with much about relevant logic, is Restall (2000). Mares
(2002) introduces relevant logic through natural deduction. For the reader who wants
technical details and proofs of theorems, Anderson and Belnap (1975) and Anderson et al. (1992)
are excellent sources. Anderson et al. (1992) contains a detailed, although now out of date,
bibliography of work on relevant logic, compiled by Robert G. Wolf.


Notes 


1 For the pre-history and history of relevant logics, a good source is Read (1988).

2 Relevant logic locates the fallacy at line 5, denying the entailment (~A ∧ (A ∨ B)) → B.
See Anderson and Belnap (1975, section 16.1) or Read (1988) for discussion of this issue.

3 This holds for propositional relevant logics without so-called Ackermann constants t and
f or Church constants T and F (see section 13.8).

4 We are very grateful to Nicholas Griffin for pointing out this passage to us.

5 For a clear presentation of the relevant deduction theorem, see Dunn (1984).

6 There are several good introductions to natural deduction for relevant logics such as
Anderson and Belnap (1975), Dunn (1984), and Mares (2002).

7 As for other properties of fusion, these vary among relevant logics. Not all relevant logics
allow fusions to commute. That is, there are relevant logics in which A ∘ B is not
equivalent to B ∘ A, but there are others, like R, where ⊢ A ∘ B ↔ B ∘ A. Fusion is not
idempotent in most relevant logics, i.e. in systems like R and B, A ∘ A is not equivalent to
A. And so on.

8 This is what Meyer and Martin (1986) call the Australian plan of relevant semantical
analysis. There is a contrasting American Plan, developed by Dunn (1992 [1976],
1969) and championed and augmented by Belnap (1977, 1992a [1977]), that utilizes a
natural four-valued semantics to refute XP-. For simplicity's sake, in what follows we
discuss only the two-valued semantics with the Routley star. Interested readers should
consult Routley et al. (1982) for the technical details of the completed American plan.

9 There is another interpretation of negation that uses the compatibility relation, but for it
this relation is not a semantical primitive. This is the implicational interpretation of
negation of Mares (1995). On this semantics, there is a falsum, f, that is taken to be true
at all and only impossible worlds (impossible worlds too can be defined in terms of other
primitives). On this semantics, one can define Cab to hold iff there is some world c, such
that Rabc, and c is not impossible. In other words, two worlds are taken to be
compatible if they can be combined (in the sense of fusion) in a possible world.

10 We do not try to give a full account of combinatory logic here. We just want to give the
reader the flavor of the theory. For a more detailed introduction, see Hindley and Seldin
(1986), q.v. for further bibliographical references to CL and LC.

11 Curry was anticipated in 1924 by Schönfinkel. Church invented the kindred LC circa
1932.

12 Strictly speaking we should distinguish the general reduction relation >, which is reflexive,
transitive and satisfies positive monotonic replacement, from >_1, of immediate (one step)
reducibility. Equality, '=', is the symmetric transitive closure of >.

13 For an introduction to types, with references and especially in LC, see Takahashi et al.
(1998).

14 For technical reasons, we will add the (top) Church constant T below; it is a member of all
non-empty theories. Then the fusion of two non-empty theories in B∧T (at least) is
also a non-empty theory.

15 But it does correspond to the constant ω, which is the whole domain of the models of LC
discussed in Barendregt et al. (1983) and Dezani et al. (1998). While we prefer the more
natural Ackermann constant t in relevant logics, T continues to make some sense as
the trivial truth implied by absolutely everything. Ackermann t, when present, admits the
2-sided rule ⊢ A iff ⊢ t → A. Think of t as a conjunction of truths (interesting) but T as
a corresponding disjunction (boring, but note that many logics conflate t and T!).

16 In essence, the LC investigators had rediscovered the basic positive relevant logic of,
e.g., Meyer and Routley (1972). Alas, they thought that they had invented a type theory,
and were not aware that they had stumbled on a relevant logic. When Meyer first met
Barendregt in 1990, he pointed this out. Barendregt conceded that, while he had not
previously thought much of relevant logics, perhaps it was time to change his tune; for
now it was clear that he had been involved (with members of the Torino group) in the
(re)invention of one.

17 This means, in McRobbie's lingo, that B is not reasonable in the sense of Anderson and
Belnap.

18 More accurately we have such a logic in S→ (and, thanks to more joint work by Fine and
Martin (2001), in S as well). Further particles (and, it may be, further axioms) await.
We add that it is a pity that the elegantly combinatorial Martin insights are as yet
insufficiently appreciated.

19 Thanks are due to Neil Leslie, who commented extensively on an earlier draft, as did
Lou Goble, Katalin Bimbó, Chris Mortensen and Beate Elsner.


Chapter 14


Many-Valued Logics 
Grzegorz Malinowski 


Classical logic is based on the principle of bivalence, that every proposition has
exactly one of the two logical values truth or falsity. This finds expression in the two
laws: the law of the excluded middle


(EM) p ∨ ¬p

and the law of non-contradiction,

(CP) ¬(p ∧ ¬p)


With the classical understanding of the connectives, EM and CP may be read as
stating that of the two propositions p and ¬p, at least one is true and at least one is
false, respectively.

The most natural and straightforward step beyond two-valued logic is to introduce
more logical values, thereby rejecting the principle of bivalence. Another, indirect,
way consists in challenging the classical laws concerning the sentence connectives
and introducing other non-two-valued connectives into the language. Either way,
propositional logic seems fundamental to many-valuedness, rather than its first-order
extension. Hence, although there has been interesting research into first-order
many-valued logics, we shall confine our discussion here to the 0-order case.

While the roots of many-valued logics can be seen in Aristotle, with his famous
concern for future contingents and the 'sea-battle tomorrow', and traced through
the middle ages and the nineteenth century, the real 'era of many-valuedness' began
in 1920 with the work of Łukasiewicz and Post. This chapter looks at each in turn,
and then some others.





14.1. Łukasiewicz Three-Valued Logic


Łukasiewicz first introduced a third logical value, which can be called ½, in addition
to 0 and 1 for falsehood and truth, as a result of philosophical investigation into
ideas of freedom, indeterminism, future contingents, modality, and also the paradoxes
of set theory. With Aristotle, he argued that a sentence like


I shall be in Warsaw at noon on 21 December of the next year.


is, at the time of its utterance, neither true nor false, since otherwise fatalist conclusions
about necessity or impossibility of contingent future events would follow. The value
½ was to apply to such cases. Initially, Łukasiewicz interpreted this third logical value,
½, as 'possibility' or 'indeterminacy', and, following his intuitions about these
concepts, he extended the classical interpretation of negation and implication according
to the tables:¹


    α | ¬α        → | 0  ½  1
    --+----       --+---------
    0 | 1         0 | 1  1  1
    ½ | ½         ½ | ½  1  1
    1 | 0         1 | 0  ½  1


The other connectives of disjunction, conjunction and equivalence were introduced
through the definitions:


    α ∨ β =df (α → β) → β
    α ∧ β =df ¬(¬α ∨ ¬β)
    α ≡ β =df (α → β) ∧ (β → α)


Their tables are:


    ∨ | 0  ½  1       ∧ | 0  ½  1       ≡ | 0  ½  1
    --+---------      --+---------      --+---------
    0 | 0  ½  1       0 | 0  0  0       0 | 1  ½  0
    ½ | ½  ½  1       ½ | 0  ½  ½       ½ | ½  1  ½
    1 | 1  1  1       1 | 0  ½  1       1 | 0  ½  1


A valuation of formulas in Łukasiewicz three-valued logic is any function v: For
→ {0, ½, 1} compatible with the above tables, where For is the set of formulas of the
language. A tautology is a formula which takes the designated value 1 under any
valuation v.

The set Ł_3 of tautologies of this three-valued logic of Łukasiewicz differs from the
set of two-valued tautologies TAUT of classical logic. For instance, neither the law
of the excluded middle, nor the principle of contradiction is in Ł_3. To see this, assign
½ to p: any such valuation also associates ½ with EM and CP. The thoroughgoing
refutation of these two laws was intended, in Łukasiewicz's opinion, to codify the
principles of indeterminism.
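
The failure of EM and CP can be checked mechanically. The following is a minimal Python sketch, assuming the usual arithmetic rendering of the Łukasiewicz tables (negation as 1 - a, implication as min(1, 1 - a + b)); the function names are illustrative only and do not come from the original presentation.

```python
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)
VALUES = (Fraction(0), HALF, Fraction(1))

def neg(a):
    # Lukasiewicz negation: 0 -> 1, 1/2 -> 1/2, 1 -> 0
    return 1 - a

def imp(a, b):
    # Lukasiewicz implication, arithmetic form of the three-valued table
    return min(Fraction(1), 1 - a + b)

def disj(a, b):
    # a v b defined as (a -> b) -> b
    return imp(imp(a, b), b)

def conj(a, b):
    # a ^ b defined as ~(~a v ~b)
    return neg(disj(neg(a), neg(b)))

def is_tautology(formula, arity):
    # A tautology takes the designated value 1 under every valuation
    return all(formula(*vals) == 1 for vals in product(VALUES, repeat=arity))

def excluded_middle(p):
    return disj(p, neg(p))

def non_contradiction(p):
    return neg(conj(p, neg(p)))
```

Under this sketch both excluded_middle(HALF) and non_contradiction(HALF) evaluate to 1/2, so neither law is a tautology, while a formula such as p → p still is.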

Another property of the new semantics is that some classically inconsistent formulas
are no longer contradictory in Ł_3. One such:


(*) p ≡ ¬p

is connected with the famous Russell paradox of the 'set of all sets that are not their
own elements' [see chapter 3]. Russell's set is defined by the equation

    Z = {x : x ∉ x}

and the resulting paradox

    Z ∈ Z ≡ Z ∉ Z


is an instance of (*). Russell's paradox ceases to be an antinomy in Ł_3, however,
since putting ½ for p makes the formula true and therefore (*) is non-contradictory.
Łukasiewicz found this to be a strong argument in favor of his three-valued logic.

Łukasiewicz also sought to formalize the modal functors of possibility M and
necessity L. Aware of the impossibility of representing these functors in
truth-functional classical logic, he proposed taking the three-valued logic as their basis
instead. In 1921, Tarski (Łukasiewicz, 1967 [1930]) produced simple definitions,
using negation and implication, of these two connectives that would meet
Łukasiewicz's requirements, namely:


    x | Mx        x | Lx
    --+----       --+----        Mα =df ¬α → α
    0 | 0         0 | 0
    ½ | 1         ½ | 0          Lα =df ¬M¬α = ¬(α → ¬α)
    1 | 1         1 | 1
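
That Tarski's definitions yield exactly the intended modal tables can be verified directly. The sketch below is a small Python check under the same assumption as before, namely the arithmetic form of the Łukasiewicz negation and implication; the names possible and necessary are illustrative.

```python
from fractions import Fraction

HALF = Fraction(1, 2)
VALUES = (Fraction(0), HALF, Fraction(1))

def neg(a):
    return 1 - a

def imp(a, b):
    return min(Fraction(1), 1 - a + b)

def possible(a):
    # M a  =df  ~a -> a
    return imp(neg(a), a)

def necessary(a):
    # L a  =df  ~M~a
    return neg(possible(neg(a)))

# Tables computed from the definitions
M_TABLE = {a: possible(a) for a in VALUES}
L_TABLE = {a: necessary(a) for a in VALUES}
```

The computed tables send ½ to 1 under M and to 0 under L, and necessary(a) agrees with ¬(a → ¬a) on all three values.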


The first known axiomatization of a system of many-valued logic was Wajsberg's
(1967 [1931]) axiomatization of Ł_3, namely, for the (¬, →)-version of Łukasiewicz's
three-valued propositional calculus:

    W1  p → (q → p)
    W2  (p → q) → ((q → r) → (p → r))
    W3  (¬p → ¬q) → (q → p)
    W4  ((p → ¬p) → p) → p

with the rules modus ponens (MP) and universal substitution (SUB). The result
obviously applies to the whole Ł_3 since the other Łukasiewicz connectives are definable
in terms of negation and implication.


14.2. Post Logics


As an outcome of his research on the classical propositional logic, Post (1920, 1921)
construed a family of finite-valued propositional logics. This was inspired by the
formalization of the classical propositional calculus, CPC, presented in Principia
Mathematica of Whitehead and Russell (1910), by the method of truth tables, and
by Post's own results concerning functional completeness for classical logic.

Following Principia Mathematica, Post takes negation, ¬, and disjunction, ∨, as
primitive connectives. For any natural n ≥ 2, he considers a linearly ordered set of
objects


    P_n = {t_1, t_2, ..., t_n},


t_i < t_j iff (if and only if) i < j, equipped with two operations corresponding to
connectives: unary rotation (or cyclic negation) ¬ and binary disjunction ∨ defined as:


    ¬t_i = t_{i+1}   if i ≠ n
    ¬t_i = t_1       if i = n

    t_i ∨ t_j = t_{max(i, j)}


These equations define, for a given n ≥ 2, n-element truth tables of negation and
disjunction. Thus, e.g., for n = 5, the tables are:


     x  | ¬x         ∨  | t_1 t_2 t_3 t_4 t_5
    ----+-----      ----+---------------------
    t_1 | t_2       t_1 | t_1 t_2 t_3 t_4 t_5
    t_2 | t_3       t_2 | t_2 t_2 t_3 t_4 t_5
    t_3 | t_4       t_3 | t_3 t_3 t_3 t_4 t_5
    t_4 | t_5       t_4 | t_4 t_4 t_4 t_4 t_5
    t_5 | t_1       t_5 | t_5 t_5 t_5 t_5 t_5


It is easy to see that for n = 2, Post logic coincides with the negation-disjunction
version of the classical logic: the set P_2 = {t_1, t_2} may be identified as containing 0
and 1, respectively, and then the Post negation and disjunction are isomorphic
variants of the classical connectives.² The relation to CPC breaks down for n > 2. In
all these cases, the truth table of negation is not compatible with the classical one.
To see that, remark that due to the properties of disjunction, t_1 always corresponds
to 0 and t_n to 1. Though ¬t_n = t_1, ¬t_1 equals t_2, which is not t_n. Accordingly, it can
be said that the n-valued Post algebra


    P_n = ({t_1, t_2, ..., t_n}, ¬, ∨)


either coincides with the negation-disjunction algebra of CPC (n = 2), or the latter
algebra is not a subalgebra of it (n > 2).

Post considers the 'highest' value t_n as the distinguished element. It is remarkable
that among the laws of all his logics (n > 2) generalizations of some significant
tautologies of the classical logic expressed in terms of negation and disjunction are
still present. For example, this counterpart of the law of the excluded middle:


    p ∨ ¬p ∨ ¬¬p ∨ ... ∨ ¬¬...¬p    (the last disjunct carrying n − 1 negation signs)

is such a formula. By contrast, applications of classical definitional patterns for other
standard connectives, like conjunction, implication and equivalence, lead to very
strange results. For example, the definition of conjunction using the De Morgan law


    α ∧ β = ¬(¬α ∨ ¬β)


results in a non-associative connective ∧! The source of unexpected properties is,
manifestly, the rotational character of Post negation.
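
Both observations are easy to test by brute force. The following Python sketch is an illustration only: it encodes the value t_i as the integer i, takes negation to be the cyclic rotation and disjunction to be the maximum, as in the definitions above, and uses hypothetical helper names.

```python
from itertools import product

def post_neg(i, n):
    # Cyclic rotation: t_i -> t_{i+1}, and t_n -> t_1 (t_i encoded as i)
    return i + 1 if i != n else 1

def post_or(i, j):
    # Disjunction is the maximum with respect to the linear order
    return max(i, j)

def generalized_em(p, n):
    # p v ~p v ~~p v ... with n - 1 negations on the last disjunct
    value, current = p, p
    for _ in range(n - 1):
        current = post_neg(current, n)
        value = post_or(value, current)
    return value

def demorgan_and(i, j, n):
    # "Conjunction" obtained from the De Morgan pattern ~(~a v ~b)
    return post_neg(post_or(post_neg(i, n), post_neg(j, n)), n)

def associativity_failures(n):
    # All triples (a, b, c) with (a ^ b) ^ c different from a ^ (b ^ c)
    vals = range(1, n + 1)
    return [(a, b, c) for a, b, c in product(vals, repeat=3)
            if demorgan_and(demorgan_and(a, b, n), c, n)
            != demorgan_and(a, demorgan_and(b, c, n), n)]
```

For n = 2 the list of failures is empty, as expected for classical conjunction, whereas for n = 3 the triple (1, 1, 2) already witnesses non-associativity; the generalized excluded middle always takes the top value n.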

The most important property of Post algebras is their functional completeness:
every finite-argument function on P_n can be defined by means of the two primitive
functions. In particular, then, also the constant functions and hence the 'logical
values' t_1, t_2, ..., t_n themselves. Establishing functional completeness was one of
Post's primary aims.

Post's original construction, definitely algebraic, was eventually provided with an
interesting semantic interpretation. Post suggests regarding the elements of P_n as
objects corresponding to special (n − 1)-element tuples P = (p_1, p_2, ..., p_{n-1}) of
ordinary two-valued propositions p_1, p_2, ..., p_{n-1}. More specifically, P_n is replaced with
the 'space' E^{n-1} of such tuples subject to the condition that the true propositions are
listed before the false. The connectives on E^{n-1} are defined as:


(¬) ¬P is formed by replacing the first false element by its denial, but if there
    is no false element in P, then all are to be denied, in which case ¬P is a
    sequence of false propositions.

(∨) When P = (p_1, p_2, ..., p_{n-1}) and Q = (q_1, q_2, ..., q_{n-1}), then
    P ∨ Q = (p_1 ∨ q_1, p_2 ∨ q_2, ..., p_{n-1} ∨ q_{n-1}).


The mapping i: E^{n-1} → P_n,

    i(P) = t_i iff P contains exactly (i − 1) true propositions,

establishes an isomorphism of (E^{n-1}, ¬, ∨) onto the Post algebra P_n. For example,
the universe E^4, corresponding to the case of five-valued Post logic considered
before, consists of these 4-tuples:


    (0, 0, 0, 0)    t_1
    (1, 0, 0, 0)    t_2
    (1, 1, 0, 0)    t_3
    (1, 1, 1, 0)    t_4
    (1, 1, 1, 1)    t_5


This interpretation of logic values and its algebra shows, among other things, that
the values in different Post logics should be understood differently.
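
The isomorphism can be confirmed for any small n by enumeration. The sketch below is a minimal Python check, assuming that a tuple is represented by 0s and 1s with the true components first and that t_i is encoded as the integer i; all function names are illustrative.

```python
from itertools import product

def tuples(n):
    # Elements of E^{n-1}: k true components followed by false ones, k = 0..n-1
    return [tuple([1] * k + [0] * (n - 1 - k)) for k in range(n)]

def tuple_neg(P):
    # Deny the first false element; if none is false, deny all of them
    if 0 not in P:
        return tuple(0 for _ in P)
    first_false = P.index(0)
    return P[:first_false] + (1,) + P[first_false + 1:]

def tuple_or(P, Q):
    # Componentwise classical disjunction
    return tuple(max(p, q) for p, q in zip(P, Q))

def index(P):
    # i(P) = i iff P contains exactly i - 1 true propositions
    return sum(P) + 1

def post_neg(i, n):
    return i + 1 if i != n else 1

def is_isomorphism(n):
    E = tuples(n)
    bijective = sorted(index(P) for P in E) == list(range(1, n + 1))
    neg_ok = all(index(tuple_neg(P)) == post_neg(index(P), n) for P in E)
    or_ok = all(index(tuple_or(P, Q)) == max(index(P), index(Q))
                for P, Q in product(E, repeat=2))
    return bijective and neg_ok and or_ok
```

For n = 5 the function tuples returns exactly the five 4-tuples listed above, and is_isomorphism(5) holds.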

Post (1921) also defined a family of purely implicative n-valued logics. The family
is fairly extensive and it covers implications designed by other authors, e.g. Łukasiewicz
and Gödel. The novelty of this truth-table proposal was that Post designated many
logical values at a time. That possibility, quite natural nowadays, was not considered
by many of the originators of many-valued logics.


14.3. Three-Valued Logic of Kleene 


Kleene (1938, 1952) is the author of two systems of three-valued propositional and
predicate logic designed to allow for the indeterminacy of some propositions at a
certain stage of investigation. These were particularly inspired by research in the
foundations of mathematics and the theory of recursion, where there was need for
tools that render the analysis of partially defined predicates, or propositional
functions, possible.

To be aware of the necessity for such logic(s), consider a simple example of such
a predicate, the mathematical property P defined by the equivalence


    P(x) iff 1 ≤ 1/x ≤ 2


where x is a variable ranging over the set of real numbers. It is apparent that the
propositional function P(x) is undetermined when x = 0. More precisely,


                          true           if ½ ≤ a ≤ 1
    Proposition P(a) is   undetermined   if a = 0
                          false          otherwise


The starting point of Kleene's (1938) construction consists in considering also
the propositions whose logical value of truth (T) or falsity (F) is undefined, undetermined
by means of accessible algorithms, or not essential for actual consideration.
The third logical value of undefiniteness (U) is reserved for this category of
propositions. Kleene's counterparts of the standard connectives are defined by these
tables:


    α | ¬α     → | F U T     ∨ | F U T     ∧ | F U T     ≡ | F U T
    --+----    --+-------    --+-------    --+-------    --+-------
    F | T      F | T T T     F | F U T     F | F F F     F | T U F
    U | U      U | U U T     U | U U T     U | F U U     U | U U U
    T | F      T | F U T     T | T T T     T | F U T     T | F U T


One may easily notice that, as in Łukasiewicz logic, the connectives' behavior towards
the classical logical values T and F remains unchanged. Furthermore, the classical
interdefinability of α → β and ¬α ∨ β is preserved.

Kleene takes T as the only distinguished value, with the result that no
formula is a tautology. This follows from the fact that any valuation which assigns U
to every propositional variable also assigns U to any formula. It is striking that a
'conservative' extension of the two-valued logic should reject all classical tautologies,
even p → p and p ≡ p.
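
The argument can be illustrated with a small evaluator. The following Python sketch assumes the strong Kleene tables in their usual order-theoretic form (values ordered F < U < T, negation as reflection, disjunction as maximum, conjunction as minimum); the tuple encoding of formulas is an illustrative choice.

```python
F, U, T = 0, 1, 2   # encoded so that F < U < T

def k_not(a):
    return 2 - a

def k_or(a, b):
    return max(a, b)

def k_and(a, b):
    return min(a, b)

def k_imp(a, b):
    # Strong Kleene implication, interdefinable with ~a v b
    return k_or(k_not(a), b)

def k_iff(a, b):
    return k_and(k_imp(a, b), k_imp(b, a))

OPS = {"not": k_not, "or": k_or, "and": k_and, "imp": k_imp, "iff": k_iff}

def evaluate(formula, valuation):
    # A formula is a variable name (str) or a tuple (op, arg1[, arg2])
    if isinstance(formula, str):
        return valuation[formula]
    op, *args = formula
    return OPS[op](*(evaluate(arg, valuation) for arg in args))

# A few classical tautologies
SAMPLES = [
    ("imp", "p", "p"),
    ("iff", "p", "p"),
    ("or", "p", ("not", "p")),
    ("imp", "p", ("imp", "q", "p")),
]
ALL_U = {"p": U, "q": U}
```

Every formula in SAMPLES takes the value U under ALL_U, so none of them is a tautology when T is the only distinguished value.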

Körner (1966) provided the most accurate and compatible interpretation of Kleene's
connectives, defining the notion of an inexact class of a given non-empty domain A
generated by a partial definition D(P) of some property P (of elements of A) to be
a three-valued 'characteristic function' X_P: A → {−1, 0, +1}:


               −1  when P(a) is false according to D(P)
    X_P(a) =    0  when P(a) is D(P)-undecidable
               +1  when P(a) is true according to D(P)


Any family of inexact classes of a given domain A forms a De Morgan lattice, whose
algebraic operations ∪, ∩ and −


    (X ∪ Y)(a) = max{X(a), Y(a)}
    (X ∩ Y)(a) = min{X(a), Y(a)}
    (−X)(a) = −X(a)


are counterparts of the Kleene connectives. Körner's ideas have recently been
revitalized in the theory of rough sets of Pawlak (1991) and the approximation logic based
on it (Rasiowa, 1991; Bolc and Borowik, 1992).

Kleene (1952) refers to these connectives as 'strong' and introduces another set of
'weak' connectives. With negation and equivalence the same, he defines the three
others by the tables


    → | F U T     ∨ | F U T     ∧ | F U T
    --+-------    --+-------    --+-------
    F | T U T     F | F U T     F | F U F
    U | U U U     U | U U U     U | U U U
    T | F U T     T | T U T     T | F U T


The new truth-tables are to describe the employment of logical connectives with
respect to those arithmetical propositional functions whose decidability depends on
effective recursive procedures. They are constituted according to the rule that any
single appearance of U results in the whole context taking U, the motivation being
that indeterminacy occurring at any stage of computation makes the entire
procedure undetermined.


14.4. Bochvar Logic and Beyond


Bochvar's (1938) conception of the three-valued logic is based on the division of
propositions into sensible and senseless, and then 'mapping' that into a two-level
formal language. A proposition is meaningful if it is either true or false; all other
sentences are considered as meaningless or paradoxical. This approach was designed
for solving paradoxes that emerge from classical logic and set theory based on it.

The propositional language of Bochvar logic has two levels, which correspond to
object language and metalanguage. Both levels have their own connectives that are
counterparts of negation, implication, disjunction, conjunction and equivalence. The
two planes of Bochvar construction correspond to Kleene weak logic (internal) and
to classical logic (external), respectively. The internal connectives are conservative
three-valued generalizations of the classical ones; they are denoted here as ¬, →, ∨,
∧ and ≡. The external connectives are devised to characterize the relations between
logical values of propositions. They are 'metalinguistic' and incorporate the
expressions '... is true' and '... is false.' These are marked as starred counterparts of the
standard connectives, and understood in the following way:


    external negation:      ¬*α        'α is false'
    external implication:   α →* β     'if α is true, then β is true'
    external disjunction:   α ∨* β     'α is true or β is true'
    external conjunction:   α ∧* β     'α is true and β is true'
    external equivalence:   α ≡* β     'α is true iff β is true'

The truth tables of internal connectives have been compiled according to the rule
which follows Kleene's principle: 'every compound proposition including at least
one meaningless component is meaningless; in other cases, its value is determined
classically.' Thus, the internal Bochvar connectives coincide with Kleene's weak
connectives, as given by the tables from the last section, but with U now for the
value 'meaningless.' The truth-table description of the second collection of Bochvar
connectives is


    α | ¬*α    →* | F U T     ∨* | F U T     ∧* | F U T     ≡* | F U T
    --+-----   ---+-------    ---+-------    ---+-------    ---+-------
    F | T       F | T T T      F | F F T      F | F F F      F | T T F
    U | T       U | T T T      U | F F T      U | F F F      U | T T F
    T | F       T | F F T      T | T T T      T | F F T      T | F F T


An important property of Bochvar construction, which makes it more natural, is
the compatibility of two levels. The passage from the internal to external level
is assured by the so-called external assertion 'α is true,' A*α. The truth-table for this
connective, and the intuitively justified definitions of the external connectives, are:


    α | A*α          ¬*α     =df  ¬A*α
    --+-----         α →* β  =df  A*α → A*β
    F | F            α ∨* β  =df  A*α ∨ A*β
    U | F            α ∧* β  =df  A*α ∧ A*β
    T | T            α ≡* β  =df  A*α ≡ A*β


Bochvar takes T as the designated value and thus obtains a construction that
coincides with the Kleene weak logic. Like that, Bochvar's internal logic has no tautologies.
Finally, the external logic is classical logic; this is due to the fact that the truth tables
of all external connectives 'identify' the values U and F, while the behavior of these
connectives with regard to F and T is classical.
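
The relation between the two levels can be made concrete in a short program. The sketch below is an illustration in Python, assuming the internal connectives are the weak Kleene ones (any occurrence of U yields U) and that each external connective is obtained by applying the corresponding internal connective to externally asserted arguments; the function names are hypothetical.

```python
F, U, T = "F", "U", "T"
VALUES = (F, U, T)

def weak(op):
    # Lift a classical Boolean operation to an internal (weak Kleene) connective
    def lifted(*args):
        if U in args:
            return U
        return T if op(*(a == T for a in args)) else F
    return lifted

i_not = weak(lambda a: not a)
i_imp = weak(lambda a, b: (not a) or b)
i_or = weak(lambda a, b: a or b)
i_and = weak(lambda a, b: a and b)
i_iff = weak(lambda a, b: a == b)

def assertion(a):
    # External assertion A*: 'a is true'
    return T if a == T else F

def e_not(a):
    return i_not(assertion(a))

def e_imp(a, b):
    return i_imp(assertion(a), assertion(b))

def e_or(a, b):
    return i_or(assertion(a), assertion(b))

def e_and(a, b):
    return i_and(assertion(a), assertion(b))

def e_iff(a, b):
    return i_iff(assertion(a), assertion(b))

def collapse(a):
    # Identify U with F, keeping T as truth
    return a == T

BINARY = [
    (e_imp, lambda a, b: (not a) or b),
    (e_or, lambda a, b: a or b),
    (e_and, lambda a, b: a and b),
    (e_iff, lambda a, b: a == b),
]

def external_is_classical():
    unary_ok = all(e_not(a) == (T if not collapse(a) else F) for a in VALUES)
    binary_ok = all(ext(a, b) == (T if cls(collapse(a), collapse(b)) else F)
                    for ext, cls in BINARY for a in VALUES for b in VALUES)
    return unary_ok and binary_ok
```

The check external_is_classical() succeeds: external connectives never return U, and they behave classically once U is identified with F.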

Several authors have taken up Bochvar's idea so as to develop systems appropriate
for dealing with vagueness or the logic of nonsense. Halldén (1949), for example,
rediscovered Bochvar logic for these purposes. Halldén adopts three logical values:
falsity (F), truth (T) and 'meaningless' (U), and the same rules for the connectives
of negation and conjunction as in Bochvar's internal logic.³ Halldén's system,
however, differs from the latter. First, it has a new one-argument connective + serving to
express meaningfulness of propositions. Thus, if α is meaningless, then +α is false.
Otherwise, +α is true. Second, Halldén designates two logical values, U and T.
Therefore, a formula is valid if it never takes F. In consequence, the set of valid
formulas not containing + coincides with the set of tautologies of CPC. The
construction, however, differs from classical logic by its inference properties. The logic
of nonsense heavily restricts several rules of inference, including modus ponens. Thus,
in general, q does not follow from p → q and p. To see that, it suffices to consider a
valuation for which p is meaningless and q is false. Under such a valuation, q is not
designated, while the premises, being meaningless, are both designated.
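
The counterexample to modus ponens can be found by exhaustive search. The following Python sketch assumes the weak (Bochvar-style) table for implication and the designated set {U, T}; the names are illustrative.

```python
F, U, T = "F", "U", "T"
VALUES = (F, U, T)
DESIGNATED = {U, T}

def h_imp(a, b):
    # Weak implication: meaningless as soon as one argument is meaningless
    if U in (a, b):
        return U
    return T if (a == F or b == T) else F

def modus_ponens_counterexamples():
    # Valuations (p, q) with p -> q and p designated but q not designated
    return [(p, q) for p in VALUES for q in VALUES
            if h_imp(p, q) in DESIGNATED
            and p in DESIGNATED
            and q not in DESIGNATED]
```

The search returns the single pair (U, F): p meaningless and q false, exactly the valuation described in the text.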

Halldén provides a readable axiomatization of his logic. To this aim, he
introduces the connectives of implication, →, and equivalence, ≡, accepting standard
classical definitions and two standard inference rules MP and SUB:


    H1  (¬p → p) → p
    H2  p → (¬p → q)
    H3  (p → q) → ((q → r) → (p → r))
    H4  +p ≡ +¬p
    H5  +(p ∧ q) ≡ +p ∧ +q
    H6  p → +p


In this framework, it is easy to define a dual to the + connective, putting: −α =df ¬+α.
Thus, as +α corresponds to 'α is meaningful,' −α stands for 'α is meaningless.'

Further elaboration of Halldén's approach is made by Åqvist (1962) and Segerberg
(1965). Starting with problems arising with normative sentences, Åqvist created a
propositional calculus that is a minor variant of Łukasiewicz three-valued logic, or a
fragment of Kleene strong logic. The three primitives of Åqvist's logic are: negation,
¬, disjunction, ∨, and a special connective, #. Their tables use the three values: F, U,
T (in our notation), where the intended meaning of F and T is standard and T is the
only designated value. Finally, the tables of negation and disjunction are the same,
modulo notation, as the truth-tables of Łukasiewicz three-valued connectives. In
turn, # is defined as:


    #(F) = #(U) = F and #(T) = T


As a result, this coincides with the Łukasiewicz 'necessity' operator L (see
section 14.1).

Given the philosophical application of his formal approach, Åqvist defines three
'characteristic' functors of the system:


    Fα =df #¬α        Lα =df #α ∨ Fα        Mα =df ¬Lα


whose intuitive reading is: 'α is false' (Fα), 'α is meaningful' (Lα) and 'α is
meaningless' (Mα).
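
The intended readings of the three functors follow from their definitions by a simple computation. A minimal Python sketch, assuming the values are ordered F < U < T with negation as reflection and disjunction as maximum (the Łukasiewicz tables modulo notation), and with illustrative function names:

```python
F, U, T = 0, 1, 2   # encoded so that F < U < T
VALUES = (F, U, T)

def a_not(a):
    return 2 - a

def a_or(a, b):
    return max(a, b)

def sharp(a):
    # #(F) = #(U) = F and #(T) = T
    return T if a == T else F

def is_false(a):
    # F alpha  =df  # ~alpha
    return sharp(a_not(a))

def is_meaningful(a):
    # L alpha  =df  # alpha  v  F alpha
    return a_or(sharp(a), is_false(a))

def is_meaningless(a):
    # M alpha  =df  ~ L alpha
    return a_not(is_meaningful(a))
```

Computed over F, U, T, the functor is_false is true only at F, is_meaningful is true at F and T, and is_meaningless is true only at U.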


14.5. Logic Algebras and Matrices 


The algebraic approach is an efficient tool of logical investigation; see, for example,
Rasiowa (1974) and Wójcicki (1988). Its use in the case of many-valued logic is
especially natural, and it enables a better insight into problems of many-valuedness.
A propositional language is viewed as an algebra of formulas


    L = (For, F_1, ..., F_m)

freely generated by the set of propositional variables Var = {p, q, r, ...}; the
connectives F_1, ..., F_m being finitary operations on For. An interpretation structure A for
L is an algebra


    A = (A, f_1, ..., f_m)


similar to it (the same letter A is used here for the algebra and for its universe). Any
mapping s: Var → A may be extended uniquely to the homomorphism
h_s: L → A, h_s ∈ Hom(L, A). Interpretation structures equipped with a
distinguished subset of elements to correspond to propositions of a specified kind
(e.g. true sentences) are called logical matrices. More specifically, a pair

    M = (A, D)

with A being an algebra similar to a language L and D ⊆ A, will be referred to as a
matrix for L. Elements of D will be called designated (or, distinguished) elements of
M. The set of formulas which take designated values only:

    E(M) = {α ∈ For : hα ∈ D for any h ∈ Hom(L, A)}


is called the content of M. The relation ⊨_M is said to be a matrix consequence relation
of M provided that, for any X ⊆ For, α ∈ For,


    X ⊨_M α iff for every h ∈ Hom(L, A), hα ∈ D whenever hX ⊆ D.

The content of a matrix is a counterpart of the set of tautologies and the
entailment relation ⊨_M ⊆ 2^For × For is a natural generalization of the relation of classical
consequence.

To illustrate these concepts, the language of classical logic is most familiar:

    L = (For, ¬, →, ∨, ∧, ≡)

Then, the two-element algebra of the classical logic is of the form

    A_2 = ({0, 1}, ¬, →, ∨, ∧, ≡)

(The same symbols ¬, →, ∨, ∧, ≡ may be used as for connectives to denote
corresponding operations on {0, 1} as determined by the truth-tables.) The classical
matrix has the form

    M_2 = ({0, 1}, ¬, →, ∨, ∧, ≡, {1})

and the classical consequence relation in this notation is characterized as follows:

    X ⊨_2 α iff, for every h ∈ Hom(L, A_2), hα = 1 whenever hX ⊆ {1}.

Notice that the set of tautologies is the content of M_2 and it consists of formulas
that are 'consequences' of the empty set, TAUT = E(M_2) = {α : ∅ ⊨_2 α}. The
so-called deduction theorem for classical logic expressed semantically in terms of ⊨_2 says
now that for any set of formulas X and α, β ∈ For,


    (ded_2)  X, α ⊨_2 β iff X ⊨_2 α → β.

To see how the framework of matrices and consequence relations works for
many-valued logic, consider a few properties of three-valued logics already presented. First,
the matrix of the three-valued logic of Łukasiewicz:


    M_3 = ({0, ½, 1}, ¬, →, ∨, ∧, ≡, {1})

with the connectives set by the tables in section 14.1. It is now possible to check
that Ł_3 has the deduction theorem in this form


    (ded_3)  X, α ⊨_3 β iff X ⊨_3 α → (α → β).


The left to right direction is essential. To see why the antecedent α appears twice, it
suffices to consider a valuation h which sends all formulas from X into {1} and such
that hα = ½, hβ = 0. For the same reasons, (ded_2) fails in this case.
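
Matrix consequence for a finite matrix can be decided by enumerating valuations. The sketch below is a brute-force Python illustration for the Łukasiewicz matrix, restricted for brevity to negation and implication in their arithmetic form; the tuple encoding of formulas and the example formula NEC_P (the necessity of p, written ¬(p → ¬p)) are choices made for this illustration only.

```python
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)
VALUES = (Fraction(0), HALF, Fraction(1))

def evaluate(formula, valuation):
    # A formula is a variable name (str) or a tuple (op, arg1[, arg2])
    if isinstance(formula, str):
        return valuation[formula]
    op, *args = formula
    vals = [evaluate(arg, valuation) for arg in args]
    if op == "not":
        return 1 - vals[0]
    if op == "imp":
        return min(Fraction(1), 1 - vals[0] + vals[1])
    raise ValueError(op)

def variables(formula):
    if isinstance(formula, str):
        return {formula}
    return set().union(*(variables(arg) for arg in formula[1:]))

def entails(premises, conclusion, designated=frozenset({Fraction(1)})):
    # X |= a iff every valuation designating all of X designates a
    names = sorted(variables(conclusion).union(*(variables(p) for p in premises)))
    for vals in product(VALUES, repeat=len(names)):
        v = dict(zip(names, vals))
        if (all(evaluate(p, v) in designated for p in premises)
                and evaluate(conclusion, v) not in designated):
            return False
    return True

# Example: alpha = p, beta = ~(p -> ~p), X empty
NEC_P = ("not", ("imp", "p", ("not", "p")))
```

With these definitions p entails NEC_P, the formula p → NEC_P is not a tautology (it takes ½ when p is ½), but p → (p → NEC_P) is a tautology, in line with (ded_3) and against (ded_2).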

The last example shows that a many-valued logic may differ not only on the level
of tautologies, but also with respect to the rules of consequence relation and,
ultimately, by the set of inference rules. Another case where the consequence
relation is important occurs when the logic has the empty set of tautologies, such as
one finds with Kleene and Bochvar logics. The matrix of the weak Kleene logic and,
thus, the internal Bochvar logic is


    K_3 = ({F, U, T}, ¬, →, ∨, ∧, ≡, {T})


with the second collection of operations in section 14.3. It was already stated that
this logic is non-tautological, E(K_3) = ∅. However, Kleene logic is non-trivial since
the consequence ⊨_{K_3} determined by its matrix is not. It is noteworthy that the set of
rules of ⊨_{K_3} consists of some special rules of the classical logic. Namely, for any
classically consistent X ⊆ For, i.e. such that for some valuation h ∈ Hom(L, A_2),
hX ⊆ {1},


    X ⊨_{K_3} α iff X ⊨_2 α and Var(α) ⊆ Var(X).
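
The variable-inclusion condition can be observed on small examples. The Python sketch below is an illustration under stated assumptions: formulas are encoded as nested tuples, the weak Kleene connectives return U whenever an argument is U, only T is designated, and the classical relation is obtained by restricting valuations to F and T. The helper matches_criterion is hypothetical and simply compares the two sides of the equivalence for a given (classically consistent) premise set.

```python
from itertools import product

F, U, T = "F", "U", "T"

CLASSICAL = {
    "not": lambda a: not a,
    "and": lambda a, b: a and b,
    "or": lambda a, b: a or b,
    "imp": lambda a, b: (not a) or b,
}

def evaluate(formula, valuation):
    # A formula is a variable name (str) or a tuple (op, arg1[, arg2])
    if isinstance(formula, str):
        return valuation[formula]
    op, *args = formula
    vals = [evaluate(arg, valuation) for arg in args]
    if U in vals:
        return U
    return T if CLASSICAL[op](*(x == T for x in vals)) else F

def variables(formula):
    if isinstance(formula, str):
        return {formula}
    return set().union(*(variables(arg) for arg in formula[1:]))

def entails(premises, conclusion, values):
    # Consequence with T as the only designated value, over the given values
    names = sorted(variables(conclusion).union(*(variables(p) for p in premises)))
    for vals in product(values, repeat=len(names)):
        v = dict(zip(names, vals))
        if all(evaluate(p, v) == T for p in premises) and evaluate(conclusion, v) != T:
            return False
    return True

def matches_criterion(premises, conclusion):
    var_x = set().union(*(variables(p) for p in premises))
    classical = entails(premises, conclusion, (F, T))
    weak = entails(premises, conclusion, (F, U, T))
    return weak == (classical and variables(conclusion) <= var_x)
```

For instance, p ∨ q follows classically from p but not in the weak Kleene matrix, because q does not occur among the premises and may take U; from p and q together it does follow.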


The use of logical matrices is undoubtedly the most natural way of achieving
many-valuedness, i.e. consequence relations different from ⊨_2. Two cases of obtaining
a genuine logic of this kind have already been discussed. Nevertheless, taking a
multiple-element matrix as a base for the logical construction does not guarantee its
many-valuedness. Also, there are different kinds of many-valuedness. Consider, for
instance, the matrix








    W_3 = ({0, t, 1}, ¬, →, ∨, ∧, ≡, {t, 1})

with operations defined by the tables

    x | ¬x     → | 0 t 1     ∨ | 0 t 1     ∧ | 0 t 1     ≡ | 0 t 1
    --+----    --+-------    --+-------    --+-------    --+-------
    0 | 1      0 | 1 t 1     0 | 0 t 1     0 | 0 0 0     0 | 1 0 0
    t | 0      t | 0 t t     t | t t t     t | 0 t t     t | 0 t t
    1 | 0      1 | 0 t 1     1 | 1 t 1     1 | 0 t 1     1 | 0 t 1


Notice that with every h ∈ Hom(L, W_3) the valuation h* ∈ Hom(L, M_2)
corresponds in a one-to-one way such that hα ∈ {t, 1} iff h*α = 1. Therefore, ⊨_{W_3} = ⊨_{M_2}
and W_3 is nothing more than a three-element model of the two-valued logic.⁴

The last, somewhat striking, case is when a multiple-element matrix retains all
classical tautologies, i.e. its content coincides with TAUT, but its consequence
relation differs from the classical by some rules of inference. The matrix


    K_3* = ({F, U, T}, ¬, →, ∨, ∧, ≡, {U, T})


is like the Kleene-Bochvar matrix K_3 but having two elements U and T designated;
it has this property. Its consequence operation ⊨_{K_3*} falsifies MP, since the inference
{p → q, p} ⊨_{K_3*} q does not hold.


14.6. Functional Completeness


A logic algebra is functionally complete when all finitary operations on its universe
are definable by use of its original operations. Classical propositional logic is
functionally complete in this sense. In the terminology just adopted one may say
equivalently that the algebra A_2 and, consequently, the matrix M_2 have this property.

Where n ≥ 2 is a given natural number, put E_n = {1, 2, ..., n} and by U_n denote
any algebra of the form


    U_n = (E_n, f_1, ..., f_m)


with f_1, ..., f_m being finitary operations on E_n. U_n will be called functionally complete
if every finitary mapping f: E_n^k → E_n (k ≥ 0, k finite)⁵ can be represented as a
composition of the operations f_1, ..., f_m.

This definition of functional completeness is due to Post (1921), who reduced the
complexity of the problem to a small number of connectives. If one requires only
that for some finite m any k-argument operation on E_n, where k ≤ m, is definable,
then U_n is said to be functionally complete for m variables. The logical counterpart of
the last definition is that it warrants definability of all at most m-argument connectives.





Theorem 14.1 (Post, 1921). If U_n is functionally complete for m variables,
where m ≥ 2, then it is also functionally complete for m + 1 variables and hence
also functionally complete.


Note that theorem 14.1 reduces the functional completeness of A_2 to the definability
of all 4 unary and 16 binary connectives. In turn, it is easy to show that the
connectives of the standard language define all twenty. Post himself provided several
other small collections to do the same and there is also a 'minimalist' reduction of all
classical connectives to a single one, the so-called Sheffer stroke.
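
The reduction to a single connective can be checked by computing a closure. The Python sketch below is an illustration that reads the Sheffer stroke as NAND and represents each two-argument truth function by its column of values over the four input pairs; starting from the two projections, it closes the set under NAND. The function names are illustrative.

```python
from itertools import product

# The four input pairs (p, q), in a fixed order
INPUTS = list(product((0, 1), repeat=2))

def nand(f, g):
    # Pointwise Sheffer stroke (NAND) of two truth-function columns
    return tuple(1 - (a & b) for a, b in zip(f, g))

def sheffer_closure():
    # Start from the projections p and q and close under NAND
    p = tuple(a for a, _ in INPUTS)
    q = tuple(b for _, b in INPUTS)
    defined = {p, q}
    while True:
        new = {nand(f, g) for f in defined for g in defined} - defined
        if not new:
            return defined
        defined |= new
```

The closure contains all 16 binary truth functions, which, together with theorem 14.1, is one way to see that the stroke alone suffices for the two-valued case.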

Obtaining the functional completeness of n-element algebras of logic was another
motivation for building many-valued logic. Post was the first to give such an algebra
generating two functions: the one-argument cyclic rotation (negation) and the
two-argument maximum function (disjunction) of section 14.2. In the present notation,
these look like

    ¬x = x + 1   if x ≠ n
    ¬x = 1       if x = n

    x ∨ y = max(x, y)

Consequently, every Post algebra

    P_n = (E_n, ¬, ∨)

is functionally complete (Post, 1921). It is of interest to notice that

    P_2 = (E_2, ¬, ∨)

is the (¬, ∨)-reduct of the algebra A_2.
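
For small n, part of Post's result can be observed experimentally. The following Python sketch is only an illustration: it computes all one-variable term functions of the Post algebra (rotation and maximum on {1, ..., n}), each represented by its tuple of values, by closing the identity function under the two operations. It checks unary functions only and is not a proof of functional completeness.

```python
def unary_closure(n):
    # All one-variable term functions of P_n, as tuples (f(1), ..., f(n))
    identity = tuple(range(1, n + 1))
    defined = {identity}
    while True:
        # Apply the cyclic negation pointwise
        new = {tuple(x + 1 if x != n else 1 for x in f) for f in defined}
        # Apply the maximum (disjunction) pointwise
        new |= {tuple(max(a, b) for a, b in zip(f, g))
                for f in defined for g in defined}
        new -= defined
        if not new:
            return defined
        defined |= new
```

For n = 3 the closure has 27 = 3^3 elements, i.e. every one-argument operation on E_3 is definable, including the constant functions.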

The functional completeness of the n-valued logic algebras is important because
propositional logics founded on such algebras are logics of all possible extensional
n-valued connectives (truth functional when n = 2) and, for every n, they are, in a
sense, unique. Since functional completeness is not a frequent property, several
criteria have been formulated which might help to determine its presence.


‘Theorem 14.2  (Shipecki, (1939) An wvalued algebra Uj(n = 0, m finite) is 
functionally complete iff in U, there are definable: 


(i) all one-argument operations on E, 
(ii) at least one two-argument operation fix, y) whose range consists of all 
values i for 1m 


Using theorem 14.2, one may easily establish the functional incompleteness of all
the three-valued logics described above, excepting, of course, the Post logic. Thus,
for example, for the Lukasiewicz three-valued logic $\text{Ł}_3$, it suffices to observe that the
one-argument constant function $T$, with $Tx = 1/2$ for any $x \in \{0, 1/2, 1\}$, is not definable
in terms of the basic operations (connectives) of $\text{Ł}_3$. For consider any compound function of
one argument and assume that $x \in \{0, 1\}$; then, due to the tables of the primitive
connectives, the output value cannot be equal to $1/2$. On the other hand, the same
criterion implies that adding $T$ to the stock of functions of $\text{Ł}_3$ leads to a functionally
complete logic algebra (Słupecki, 1967 [1936]). Furthermore, Słupecki proved
that the set of axioms W1-W4 of Wajsberg (see section 14.1) together with


W5 $Tp \to \neg Tp$
W6 $\neg Tp \to Tp$

axiomatize the functionally complete version of Lukasiewicz's three-valued logic.
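
The argument about the constant function T can be replayed by brute force. The sketch below is illustrative only: it codes the values 0, 1/2, 1 as the integers 0, 1, 2, takes negation and implication as the primitives of the three-valued Lukasiewicz logic, and computes every one-argument function definable from a single variable, with and without T. The function name `definable_unary` is hypothetical.

```python
# Truth values 0, 1/2, 1 are coded as 0, 1, 2 to keep the arithmetic integral.
VALUES = (0, 1, 2)
HALF = 1


def neg(x):
    return 2 - x                      # 1 - x in the original scale


def imp(x, y):
    return min(2, 2 - x + y)          # min(1, 1 - x + y) in the original scale


def definable_unary(with_T):
    """Columns (f(0), f(1/2), f(1)) of every one-argument function definable
    from the variable x by negation and implication (plus T if requested)."""
    found = {VALUES}                  # the identity function x
    while True:
        new = {tuple(neg(a) for a in f) for f in found}
        new |= {tuple(imp(a, b) for a, b in zip(f, g))
                for f in found for g in found}
        if with_T:
            new.add((HALF, HALF, HALF))   # T applied to anything is constantly 1/2
        new -= found
        if not new:
            return found
        found |= new


plain = definable_unary(with_T=False)
extended = definable_unary(with_T=True)
print((HALF, HALF, HALF) in plain, len(extended))
```

Without T, every definable function sends the classical inputs 0 and 1 to classical outputs, so the constant 1/2 is never reached. With T, all 27 one-argument functions appear, which is condition (i) of theorem 14.2; condition (ii) is met by the implication itself, whose range is the whole value set.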

Słupecki (1939) also constructed the largest possible class of functionally complete
logics and gave a general method of their axiomatization. The Słupecki matrix
$S_{nk}$ ($n$ being a given natural number, $1 \le k < n$) is of the form

$$S_{nk} = (\{1, 2, \ldots, n\}, \to, R, S, \{1, 2, \ldots, k\}),$$

where $\to$ is a binary operation (implication), and $R$, $S$ are unary operations defined by

$$x \to y = \begin{cases} y & \text{if } 1 \le x \le k \\ 1 & \text{if } k < x \le n \end{cases}$$

$$R(x) = \begin{cases} x + 1 & \text{if } 1 \le x \le n - 1 \\ 1 & \text{if } x = n \end{cases} \qquad S(x) = \begin{cases} 2 & \text{if } x = 1 \\ 1 & \text{if } x = 2 \\ x & \text{if } 3 \le x \le n \end{cases}$$

Here the similarity to Post logics is evident: $R$ is the Post negation and $S$ a sort of
'permutation' function. Finally, the classical properties of implication enabled Słupecki's
original axiomatization of the class of logics defined by the above matrices.

The next section looks briefly at some infinite-valued logical constructions. These
are all functionally incomplete, due to the fact that the set of possible functions of
any algebra of this kind is uncountably infinite, while using a finite number of
original operations one may define, at most, a countable family of functions.


14.7. Lukasiewicz Logics 


In 1922, Lukasiewicz generalized his three-valued logic and defined a family of
many-valued logics, both finite and infinite-valued (Lukasiewicz, 1970c, p. 140). A
Lukasiewicz $n$-valued matrix has the form

$$M_n = (L_n, \neg, \to, \vee, \wedge, \equiv, \{1\}),$$

where, for $N$ the set of natural numbers,

$$L_n = \begin{cases} \{0, 1/(n-1), 2/(n-1), \ldots, 1\} & \text{if } n \in N,\ n \ge 2 \\ \{s/w : 0 \le s \le w,\ s, w \in N \text{ and } w \ne 0\} & \text{if } n = \aleph_0 \\ [0, 1] & \text{if } n = \aleph_1 \end{cases}$$

and the functions are defined on $L_n$ as

(i) $\neg x = 1 - x$
    $x \to y = \min(1, 1 - x + y)$

(ii) $x \vee y = (x \to y) \to y = \max(x, y)$
     $x \wedge y = \neg(\neg x \vee \neg y) = \min(x, y)$
     $x \equiv y = (x \to y) \wedge (y \to x) = 1 - |x - y|$

The introduction of these new many-valued logics was not supported by any
separate argumentation: Lukasiewicz did not give new reasons for the choice of
more logical values. He merely underlined that the generalization was correct since
for $n = 3$ one obtains exactly the matrix of his 1920 three-valued logic. Later history
has shown, however, that the Lukasiewicz logics have many properties that make
them among the most important logical constructions.
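
The defining equations of the Lukasiewicz connectives can be verified on any finite value set. The following sketch uses exact fractions; negation and implication are taken as primitive, the other connectives are built from them by the usual definitions, and the results are compared with max, min and 1 - |x - y|. All function names are illustrative.

```python
from fractions import Fraction
from itertools import product


def L(n):
    """The finite Lukasiewicz value set {0, 1/(n-1), ..., 1}."""
    return [Fraction(i, n - 1) for i in range(n)]


def neg(x):
    return 1 - x


def imp(x, y):
    return min(Fraction(1), 1 - x + y)


def disj(x, y):          # (x -> y) -> y
    return imp(imp(x, y), y)


def conj(x, y):          # not(not x or not y)
    return neg(disj(neg(x), neg(y)))


def equiv(x, y):         # (x -> y) and (y -> x)
    return conj(imp(x, y), imp(y, x))


def identities_hold(n):
    """Check the max / min / 1 - |x - y| characterizations on L(n)."""
    return all(
        disj(x, y) == max(x, y)
        and conj(x, y) == min(x, y)
        and equiv(x, y) == 1 - abs(x - y)
        for x, y in product(L(n), repeat=2)
    )


def classical_values_closed():
    """The set {0, 1} is closed under all five connectives."""
    two = set(L(2))
    return all(
        {neg(x), imp(x, y), disj(x, y), conj(x, y), equiv(x, y)} <= two
        for x, y in product(two, repeat=2)
    )


print(all(identities_hold(n) for n in range(2, 8)), classical_values_closed())
```

For n = 3 the same functions reproduce the tables of the three-valued logic, which is the sense in which the generalization is conservative.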

First, the Lukasiewicz matrix $M_2$ coincides with the matrix of the classical logic.
Moreover, the set $\{0, 1\}$ is closed with respect to all Lukasiewicz connectives, which
is another expression of the conservative character of the generalization. Consequently,
$A_2$ is a subalgebra of any algebra $(L_n, \neg, \to, \vee, \wedge, \equiv)$ and $M_2$ is a submatrix of $M_n$.
Therefore, all tautologies of Lukasiewicz propositional calculi are included in the
classical TAUT:


$$E(M_n) \subseteq E(M_2) = \text{TAUT}$$


Next, the relations between the contents of finite matrices are established by the
famous Lindenbaum condition (Lukasiewicz and Tarski, 1930):


Theorem 14.3 For finite $n, m \in N$, $E(M_n) \subseteq E(M_m)$ iff $m - 1$ is a divisor of
$n - 1$.
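
One half of the Lindenbaum condition rests on a simple inclusion of value sets, which the following sketch checks for small matrices. It only illustrates when the value set of M_m sits inside that of M_n (so that M_m is a submatrix and every tautology of M_n is a tautology of M_m); it does not attempt the converse direction of the theorem. The names `L` and `is_submatrix` are illustrative.

```python
from fractions import Fraction


def L(n):
    """Value set of the n-valued Lukasiewicz matrix."""
    return {Fraction(i, n - 1) for i in range(n)}


def is_submatrix(m, n):
    """True when the value set of M_m is contained in that of M_n.
    The connectives are given by the same formulas on both sets, so
    containment of the value sets is all that has to be checked."""
    return L(m) <= L(n)


pairs = [(m, n) for n in range(2, 13) for m in range(2, n + 1)]
agree = all(
    is_submatrix(m, n) == ((n - 1) % (m - 1) == 0)
    for m, n in pairs
)
print(agree)
```

For instance, the three-valued set {0, 1/2, 1} is contained in the five-valued set but not in the four-valued one, matching the fact that 2 divides 4 but not 3.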


The proof of the last property may be based on the 'submatrix' properties, mentioned
above, of the family of the finite matrices of Lukasiewicz. Using the same
argument one may also prove the counterpart of theorem 14.3 for matrix consequence
relations $\models_n$ of $M_n$:


Theorem 14.4 For finite $n, m \in N$, $\models_n \subseteq \models_m$ iff $m - 1$ is a divisor of $n - 1$.


The most interesting property of the infinite Lukasiewicz matrices is that they have
the same set of tautologies, i.e., a common content, which is equal to the intersection
of the contents of all finite matrices:


Theorem 14.5 $E(M_{\aleph_0}) = E(M_{\aleph_1}) = \bigcap \{E(M_n) : n \ge 2, n \in N\}$


Lukasiewicz $n$-valued logics $\text{Ł}_n$ are not functionally complete. All of what was
established for $n = 3$ applies for each finite $n$. First, no constant except 0 and 1 is
definable in $(L_n, \neg, \to, \vee, \wedge, \equiv)$. Second, adding the constants to the stock of
connectives makes this algebra functionally complete (compare theorem 14.2).
And, since $M_n$ is generated either by $1/(n - 1)$ or by $(n - 2)/(n - 1)$, adding only
one of them will do the job as well. McNaughton (1951) formulated and proved
an ingenious definability criterion for Lukasiewicz matrices, both finite and infinite,
showing the mathematical beauty of Lukasiewicz's logic constructions.

A proof that finite matrices are axiomatizable was given in Lukasiewicz and
Tarski (1930). However, the problem of formulation of a concrete axiom system for
finite Lukasiewicz logics for $n > 3$ remained open until Rosser's and Turquette's
(1952) general method of axiomatization of $n$-valued logics with connectives
satisfying the so-called standard conditions. This method can be applied, among
others, to $\text{Ł}_n$ since such connectives are either primitive or definable in Lukasiewicz
finite matrices. Hence, for every $n$ an axiomatization of Lukasiewicz's $n$-valued
propositional calculus can be obtained. The axiomatization, however, becomes
very complicated due to the high generality of the method given by Rosser and
Turquette.

In 1930, Lukasiewicz (Lukasiewicz and Tarski, 1930) conjectured that his $\aleph_0$-valued
logic was axiomatized by:


L1 $p \to (q \to p)$
L2 $(p \to q) \to ((q \to r) \to (p \to r))$
L3 $((p \to q) \to q) \to ((q \to p) \to p)$
L4 $(\neg p \to \neg q) \to (q \to p)$
L5 $((p \to q) \to (q \to p)) \to (q \to p)$


together with the rules MP and SUB. According to Lukasiewicz, this hypothesis
was confirmed by Wajsberg in 1931.⁷ Next comes the reduction of the axiom
set: Meredith (1958) and Chang (1958) independently showed that axiom L5 is
dependent on the others. There are two main accessible completeness proofs of L1-L4
(with MP and SUB): by Rose and Rosser (1958), based on syntactic methods
and linear inequalities, and by Chang (1959), with purely algebraic methods.
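
Soundness of the axioms in the finite matrices is easy to test by exhaustive evaluation. The sketch below assumes the usual reading of the five axioms (L1 weakening, L2 transitivity, L3 the commutativity of the defined disjunction, L4 contraposition, L5 the linearity-style axiom later shown dependent); the dictionary `AXIOMS` and the helper `is_tautology` are illustrative names, and the check covers only a few finite n, not the infinite-valued matrix itself.

```python
from fractions import Fraction
from itertools import product


def neg(x):
    return 1 - x


def imp(x, y):
    return min(Fraction(1), 1 - x + y)


# Every formula is written as a function of (p, q, r) for uniformity.
AXIOMS = {
    "L1": lambda p, q, r: imp(p, imp(q, p)),
    "L2": lambda p, q, r: imp(imp(p, q), imp(imp(q, r), imp(p, r))),
    "L3": lambda p, q, r: imp(imp(imp(p, q), q), imp(imp(q, p), p)),
    "L4": lambda p, q, r: imp(imp(neg(p), neg(q)), imp(q, p)),
    "L5": lambda p, q, r: imp(imp(imp(p, q), imp(q, p)), imp(q, p)),
}


def is_tautology(formula, n):
    """True when the formula takes the designated value 1 everywhere in M_n."""
    values = [Fraction(i, n - 1) for i in range(n)]
    return all(formula(p, q, r) == 1 for p, q, r in product(values, repeat=3))


def excluded_middle(p, q, r):
    # p or not p, with disjunction read as max
    return max(p, neg(p))


sound = all(is_tautology(f, n) for f in AXIOMS.values() for n in range(2, 7))
print(sound, is_tautology(excluded_middle, 2), is_tautology(excluded_middle, 3))
```

The last two checks show the expected contrast: excluded middle is valid in the two-valued matrix but fails at the value 1/2 in the three-valued one.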

Several axiomatizations for finite-valued Lukasiewicz logics ($n > 3$) have been
obtained by extending the axiom system L1-L4 in different ways; see, for example,
Grigolia (1977) and Tokarz (1974).


14.8. Background to Formalization 


Rosser and Turquette (1952) determined the conditions that make finitely many-valued
propositional logics resemble CPC, thereby simplifying the problem of
axiomatization (and the question of their extension to predicate logics, which we
shall not discuss). Begin with the pattern of interpretation discussed in section 14.4,
with matrices of the form

$$M_{n,k} = (U_n, D_k)$$

where

$$U_n = (E_n, f_1, \ldots, f_m), \qquad E_n = \{1, 2, \ldots, n\}, \qquad D_k = \{1, 2, \ldots, k\},$$

with $n \ge 2$, $n \in N$, and $1 \le k < n$. The natural number ordering conveys decreasing
degree of truth, so that 1 always refers to 'truth' and $n$ takes the role of falsity.
Next come the conditions concerning propositional connectives, which in $M_{n,k}$
have to represent negation, $\neg$, implication, $\to$, disjunction, $\vee$, conjunction, $\wedge$,
equivalence, $\equiv$, and special one-argument connectives $j_1, \ldots, j_n$. Assume that the same
symbols are used to denote the corresponding functions of $U_n$ and that a given $M_{n,k}$
is the interpretation structure. Then the respective connectives are said to satisfy the
standard conditions if for any $x, y \in E_n$ and $i \in \{1, 2, \ldots, n\}$:


$\neg x \in D_k$ iff $x \notin D_k$
$x \to y \notin D_k$ iff $x \in D_k$ and $y \notin D_k$
$x \vee y \in D_k$ iff $x \in D_k$ or $y \in D_k$
$x \wedge y \in D_k$ iff $x \in D_k$ and $y \in D_k$
$x \equiv y \in D_k$ iff either $x, y \in D_k$ or $x, y \notin D_k$
$j_i(x) \in D_k$ iff $x = i$


Each matrix $M_{n,k}$ having standard connectives as primitive or definable is called
standard. When only some of them are present, the term 'Q-standard' is used,
where Q is a subset of the set of all standard connectives.

All Post and all finite Lukasiewicz matrices are standard. The first case is easy.
Post matrices are based on functionally complete algebras (see section 14.6), and thus
any possible connective is definable. A given $n$-valued Lukasiewicz matrix may be
isomorphically transformed onto a matrix of the form $M_{n,1}$: the isomorphism is
established by the mapping $f(x) = n - (n - 1)x$ of the set


$$\{0, 1/(n-1), \ldots, (n-2)/(n-1), 1\}$$


onto $\{1, 2, \ldots, n\}$. A moment's reflection shows that the original Lukasiewicz disjunction
and conjunction satisfy the standard conditions. In turn, the other required connectives
are definable in $M_n$. Thus,


$$x \Rightarrow y = x \to (x \to (\ldots \to (x \to y) \ldots)) \quad (n - 1 \text{ occurrences of the antecedent } x)$$


$$x \Leftrightarrow y = (x \Rightarrow y) \wedge (y \Rightarrow x)$$


define the standard implication and equivalence ($\to$ appearing on the right is the
original Lukasiewicz connective). The definability of the $j$'s, $j_i(x) = 1$ iff $x = i$, follows
easily from the McNaughton (1951) criterion; see also Rosser and Turquette (1952).
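
The passage above can be tested on small cases. The sketch below is an illustration under one explicit assumption: the standard implication is taken to be the original Lukasiewicz implication iterated with n - 1 occurrences of the antecedent, which is a common way of obtaining it but is assumed here rather than read off the text. The mapping f(x) = n - (n - 1)x carries the fractional values onto {1, ..., n}, with 1 as the only designated value. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def imp(x, y):
    """Original Lukasiewicz implication on fractional values."""
    return min(Fraction(1), 1 - x + y)


def iterated_imp(x, y, n):
    """x -> (x -> ... (x -> y)) with n - 1 antecedents (assumed definition)."""
    out = y
    for _ in range(n - 1):
        out = imp(x, out)
    return out


def check_standard(n):
    values = [Fraction(i, n - 1) for i in range(n)]

    def f(x):
        # Maps 1 to 1 (truth) and 0 to n (falsity).
        return n - (n - 1) * x

    designated = {1}                                   # D_1 inside E_n
    onto = sorted(f(x) for x in values) == list(range(1, n + 1))
    conditions = all(
        ((f(max(x, y)) in designated) == (f(x) in designated or f(y) in designated))
        and ((f(min(x, y)) in designated) == (f(x) in designated and f(y) in designated))
        and ((f(iterated_imp(x, y, n)) not in designated)
             == (f(x) in designated and f(y) not in designated))
        for x, y in product(values, repeat=2)
    )
    return onto and conditions


def original_imp_is_standard(n):
    """Does the uniterated implication already meet the standard condition?"""
    values = [Fraction(i, n - 1) for i in range(n)]
    return all((imp(x, y) != 1) == (x == 1 and y != 1)
               for x, y in product(values, repeat=2))


print(all(check_standard(n) for n in range(2, 8)))
print(original_imp_is_standard(2), original_imp_is_standard(3))
```

The second line of output shows why a derived implication is needed: for n = 3 the original implication sends (1/2, 0) to 1/2, an undesignated value, although the antecedent is not designated.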

Using their framework, Rosser and Turquette solved the problem of axiomatizing
known systems of many-valued logic, including $n$-valued Lukasiewicz and Post logics.
Any logic determined by a $\{\to, j_1, j_2, \ldots, j_n\}$-standard matrix $M_{n,k}$ is axiomatized by
means of the rules MP and SUB and the following set of axioms:

A1 $p \to (q \to p)$
A2 $(p \to (q \to r)) \to (q \to (p \to r))$
A3 $(p \to q) \to ((q \to r) \to (p \to r))$
A4 $(j_i(p) \to (j_i(p) \to q)) \to (j_i(p) \to q)$
A5 $(j_n(p) \to q) \to ((j_{n-1}(p) \to q) \to (\ldots \to ((j_1(p) \to q) \to q) \ldots))$
A6 $j_i(p) \to p$ for $i = 1, 2, \ldots, k$
A7 $j_{i_1}(p_1) \to (j_{i_2}(p_2) \to (\ldots \to (j_{i_r}(p_r) \to j_f(F(p_1, \ldots, p_r))) \ldots))$,
   where $f = f(i_1, \ldots, i_r)$

where the symbols $f$ and $F$ used in A7 represent, respectively, an arbitrary function
of the matrix $M_{n,k}$ and a propositional connective associated with it (Rosser and
Turquette, 1952).

The axiom system A1-A7 consists of two parts: A1-A3 and A4-A7. The first
group of axioms describes the properties of pure classical implication sufficient to
present the deduction theorem in its classical version, (ded$_2$), cf. section 14.8. The
second group contains formulas which, due to the properties of the $j$ connectives
and implication, bridge semantic and syntactic properties. Checking the soundness
of the axioms is easy, drawing on procedures from classical logic. The completeness
proof, however, requires much calculation and involves quite a complicated induction.


14.9. Interpretation and Justification 


Problems of the philosophical interpretation and intuitive explanation of many-valuedness
are vexed questions of logic. Through time, even those motivations that
were once accepted have undergone revision. Only some algebraic interpretations
remain untouched, mostly due to the mathematical properties and their usefulness
for solving further formal problems. This section concentrates on selected aspects of
the topic.

Recall from section 14.1 how Lukasiewicz (1970b [1920]) was initially concerned
with 'future contingents' and interpreted the value 1/2 as 'possibility' or undetermination
of the 0-1 status of a proposition. Yet, as Gonseth (1941) argued, this way of
interpreting the third value is incompatible with other principles of the Lukasiewicz
logic. Consider two propositions $\alpha$ and $\neg\alpha$; whenever $\alpha$ is undetermined, so is $\neg\alpha$,
and then, according to the table of conjunction, $\alpha \wedge \neg\alpha$ is undetermined. This,
however, contradicts the intuition that $\alpha \wedge \neg\alpha$ is false, independent of $\alpha$'s content.
This reveals how Lukasiewicz's original interpretation neglects the mutual dependence
of some 'possible' propositions.

Furthermore, Haack (1978, p. 209) argued that even Lukasiewicz's motivation in
rejecting the law of bivalence so as to avoid the fatalist conclusion depends on a
modal fallacy, inferring

If $\alpha$, then it is necessary that $\beta$.

from

It is necessary that (if $\alpha$, then $\beta$).


Urquhart (1986) presents another interesting criticism of Lukasiewicz implication.
Taking the third logical value as the set $\{0, 1\}$ of two 'potential' classical values
of a future contingent sentence, he defines implication in a natural way so that an
implication having 0 as antecedent always has the value 1, an implication from 1 to
$\{0, 1\}$ has the value $\{0, 1\}$ and, finally, the implication from $\{0, 1\}$ to $\{0, 1\}$ has the
value $\{0, 1\}$. The last point is inconsistent with the Lukasiewicz stipulation, however,
since that would make the output be 1. Therefore, Urquhart claims, the Lukasiewicz
table is wrong.⁸
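
The set-valued reading of the third value can be made concrete. In the sketch below, which is one possible formalization and not taken from the source, each value is a set of 'potential' classical values, the classical implication is lifted to sets by collecting every outcome, and the result is compared entry by entry with the three-valued Lukasiewicz table. The names `lift`, `set_imp` and `luk_imp` are illustrative.

```python
from fractions import Fraction
from itertools import product

# The three values as sets of potential classical values.
F = frozenset({0})
T = frozenset({1})
U = frozenset({0, 1})        # undetermined: could turn out either way


def lift(op):
    """Lift a classical binary operation to sets of potential values."""
    def lifted(a, b):
        return frozenset(op(x, y) for x, y in product(a, b))
    return lifted


set_imp = lift(lambda x, y: max(1 - x, y))

# The Lukasiewicz table, with U standing for the value 1/2.
TO_NUM = {F: Fraction(0), U: Fraction(1, 2), T: Fraction(1)}
FROM_NUM = {v: k for k, v in TO_NUM.items()}


def luk_imp(a, b):
    return FROM_NUM[min(Fraction(1), 1 - TO_NUM[a] + TO_NUM[b])]


differences = [(a, b) for a, b in product((F, U, T), repeat=2)
               if set_imp(a, b) != luk_imp(a, b)]
print(differences == [(U, U)])
```

Under this reading the two tables disagree at exactly one entry, the implication from the undetermined value to itself, where the lifted table gives the undetermined value and the Lukasiewicz table gives 1.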

Reichenbach (1944) argued that adoption of three-valued logic would provide a
solution to some problems raised by quantum mechanics. To avoid 'causal anomalies,'
Reichenbach presents an extended version of the Lukasiewicz logic, adding further
negation and implication connectives. He refers to the third logical value as
'indeterminate' and assigns it to anomalous statements of quantum mechanics, primarily
sentences indicating both the position and the momentum of a particle at a given
time, which, according to Bohr and Heisenberg, are to be regarded as meaningless.
The weak point of Reichenbach's (1944, p. 166) proposal is that certain laws, such
as the principle of energy, are also classified as 'indeterminate.' (See Haack (1996)
for criticism of other interpretations of three-valued logic.)

The mathematical probability calculus resembles a many-valued logic, and so the
question of a connection between probability and many-valuedness naturally arises.
[See chapter 16.] Lukasiewicz (1970a [1913]) himself invented a theory of logical
probability, whose distinguishing feature was that it referred probability to propositions,
rather than events. Reichenbach (1949) and Zawirski (1934), among others,
continued this conception, attempting to create a many-valued logic in which logical
probability could find a satisfactory interpretation. The Reichenbach-Zawirski conception
is based on the assumption that there is a function Pr ranging over the set
of propositions of a given standard propositional language, with values from the real
interval $[0, 1]$. The postulates for Pr are as follows:


P1 $0 \le Pr(p) \le 1$

P2 $Pr(p \vee \neg p) = 1$

P3 $Pr(p \vee q) = Pr(p) + Pr(q)$ if $p$ and $q$ are mutually exclusive
   (i.e., $Pr(p \wedge q) = 0$)

P4 $Pr(p) = Pr(q)$ when $p$ and $q$ are logically equivalent


From P1-P4, it is possible to infer other expected properties of Pr. Identifying the
logical value $v(p)$ with the measure of probability $Pr(p)$ then gives, for $Pr(p) = 1/2$
and from the properties mentioned, $Pr(\neg p) = 1/2$ and

$$v(p \vee \neg p) = Pr(p \vee \neg p) = 1, \qquad v(p \vee p) = Pr(p \vee p) = Pr(p) = 1/2.$$

A disjunction of two propositions of value 1/2 would thus have to take two different values.
Consequently, logical probability must not be identified with logical values of any
ordinary extensional many-valued logic. Giles (1974), however, presents a plausible
interpretation of $\aleph_0$-valued Lukasiewicz logic in terms of dialogue logic with risk
values being associated with subjective probabilities.
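
The failure of extensionality can be seen in the smallest possible model. The sketch below is an illustration, not part of the source: propositions are modelled as sets of outcomes of a fair coin, Pr is the uniform measure, and disjunction is set union. The names `WORLDS`, `pr`, `p` and `not_p` are illustrative.

```python
from fractions import Fraction

# Two equally likely outcomes; a proposition is the set of outcomes where it holds.
WORLDS = frozenset({"heads", "tails"})


def pr(event):
    """Uniform probability of an event over WORLDS."""
    return Fraction(len(event), len(WORLDS))


p = frozenset({"heads"})
not_p = WORLDS - p

# Both disjuncts have probability 1/2 in each case ...
same_inputs = pr(p) == pr(not_p) == Fraction(1, 2)

# ... yet the two disjunctions receive different probabilities.
pr_p_or_not_p = pr(p | not_p)
pr_p_or_p = pr(p | p)
print(same_inputs, pr_p_or_not_p, pr_p_or_p)
```

Any truth table for disjunction would have to assign a single output to the input pair (1/2, 1/2), so no extensional connective can reproduce both results.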

Another early goal of many-valued logicians, including Lukasiewicz and Bochvar,
was to eliminate the Russell paradox. (See section 14.1.) The problem concerns the
Comprehension Axiom (CA), which states the existence of all sets bearing logically
expressible properties. [See chapter 3.] Russell's discovery excludes the acceptance of
CA in set theory based on first-order classical logic. This raises the question whether
CA could be incorporated in Lukasiewicz logics. Moh Shaw-Kwei (1954) proved
the impossibility of this for finite systems. Skolem (1957) hypothesized, however,
that CA was consistent in infinite-valued Lukasiewicz logic. Although this is an active
area of research, with several interesting results, the question in its full generality
remains open.⁹

Scott (1973) presented an interesting interpretation of finite-valued logical matrices.
Aware of the deficiency of all known interpretations of non-classical logical values,
he proposed replacing more values by more valuations. In each case, a definite
number of bivalent valuations generates a partition of the set of propositions into
types corresponding to the original logical values; Scott refers to them as indexes.
Formally, valuations are arbitrary functions from the set of formulas to the two
classical values of truth and falsity, $v_i: For \to \{T, F\}$. An $n$-element set of valuations
can thus induce maximally $2^n$ types. The actual number of types depends on limiting
conditions imposed on valuations. An accurate choice of these conditions leads to a
relatively simple characterization of the connectives of the logic under consideration.
Applying this method, Scott obtains a description of the $n$-valued Lukasiewicz
negation and implication connectives through an $(n - 1)$-element set of valuations
$\{v_0, \ldots, v_{n-2}\}$. He suggests that equalities of the form '$v_i(\alpha) = T$' should be read as
'(the statement) $\alpha$ is true to within the degree $i$.' Thus, the numbers $0, 1, \ldots, n - 2$
stand for degrees of error in deviation from the truth. Degree 0 is the strongest and
corresponds to 'perfect' truth or no error: all Lukasiewicz tautologies are schemes
of the statements having 0 as their degree of error. Lukasiewicz implication may also
be conveniently explained in these terms: The measure of error of the whole impli


Urquhart (1973) independently suggested an interpretation motivated by the
logic of tenses that is formally equivalent to Scott's interpretation. Let $\Vdash$ be a
relation between natural numbers of the set $S_n = \{0, 1, \ldots, n - 2\}$ and formulas,
with '$x \Vdash \alpha$' to be read '$\alpha$ is true at $x$'. Urquhart assumes that


If $x \Vdash \alpha$ and $x < y \in S_n$, then $y \Vdash \alpha$,

and then adapts $\Vdash$ to particular logics, specifying $n$, the language, and recursive
conditions that establish the meaning of connectives. Accordingly, each case results
in a Kripke-style semantics having a finite number of 'reference points' $S_n$. The
meaning of elements of $S_n$ depends on the properties of the logic under consideration.
For Lukasiewicz and Post logics, Urquhart suggests a temporal interpretation: 0
is the present moment and all other points of reference are future moments. The
interpretation of Post logic is entirely compatible with the original interpretation
envisaged by Post himself. The temporal way of understanding Lukasiewicz negation
and implication exhibits the sources of difficulties in obtaining a plausibly intuitive
interpretation of many-valued Lukasiewicz logic. Urquhart eventually indicates clauses
which 'natural' connectives of negation and implication should satisfy.


14.10. Applications 


Many-valued logic has always been motivated by anticipated applications. To what
extent these expectations have been fulfilled is difficult to say, but this section briefly
mentions some of the proposed concrete applications, especially for philosophical
logic and such practical areas as switching theory and computer science. Unfortunately,
there is not room enough here to describe them in detail.

Many-valued matrices have been applied to the formalization of intensional functions,
to the approximation of syntactically based non-classical logics, and to testing
the independence of the axioms of logical systems. As seen in section 14.1, Lukasiewicz
himself sought to formalize possibility and necessity within three-valued logic. Later,
he proposed a four-valued system of modal logic (Lukasiewicz, 1953). From the
philosophical point of view, such finite-valued interpretations of modalities
have no particular value, since, as Dugundji (1940) proved, for no reasonable
non-trivial modal logic is there an adequate finite matrix such that the logic coincides
with the content of the matrix. Nevertheless, their counterparts in Post algebras have
proved crucial for the computer science applications.

The matrix approach has, to some extent, been important for intuitionistic logic
[see chapter 11]. Although its creators were not guided by the idea of introducing
supplementary logical values and rendering that idea axiomatically, it turns out that
intuitionistic logic can be characterized exclusively by means of an infinite class
of infinite-valued matrices (Jaskowski, 1975 [1936]). At an early stage, Heyting
approximated the laws of intuitionism INT by the content of a three-valued matrix,
now called the Heyting matrix. This approximation was refined by Gödel (1932),
who showed that INT cannot be described by a finite matrix, nor by a finite set of
finite matrices. An essential part of Gödel's reasoning consists of the construction of
a sequence of finite matrices approximating INT, which have their own interest,
defining interesting implicative systems. Keeping the original definitions of Gödel's
connectives, one may define an infinite-valued logic. Dummett (1959) showed that
this was axiomatizable by extending INT with the axiom $(p \to q) \vee (q \to p)$. This
version of intuitionism and its relation to the original have been of special importance
in the development of intuitionistic logic and its other formalizations.

Los (1948) applied many-valuedness in the formalization of epistemic functions,
such as 'John believes that p' or, more accurately, 'John asserts that p.' Such propositions
are instances of the schema 'x asserts that p,' whose formal counterpart is a
function Lxp with two different arguments: nominal and propositional. Consider a
case with two persons and a set of propositions which they either accept or deny.
There are then four possible evaluations in terms of pairs of classical logical values,
i.e., truth and falsity, that divide the propositions into four kinds: those that both
deny, those that the first accepts and the second denies, those that the first denies and
the second accepts, and those that both accept. These four types of proposition will correspond
to four non-classical values. The connectives of negation and implication, defined
'naturally' with respect to their classical counterparts for each person, will behave
classically. Accordingly, the content of the negation-implication fragment of the
characteristic four-valued matrix coincides with the set of classical $(\neg, \to)$-tautologies.
On the base of this four-valued version of CPC, the belief operators are
formalized. This may be extended to the case with any number of persons, which
results in other formal many-valued interpretations of the classical logic with additional
operators. Los's construction thus shows how it is possible to obtain a many-valued
interpretation of some special intensional functions while simultaneously
adhering to the intuition of bivalence. Since, however, the many-valuedness thus
obtained depends on a certain relation between two categorially different arguments,
a person and a proposition, it has to be considered an atypical semantics.
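
The four-valued construction for two persons can be sketched as a product of two classical matrices. The code below is an illustration under stated assumptions: a value is a pair of classical verdicts, negation and implication act componentwise, and the only designated value is the pair accepted by both persons (this choice of designated value is an assumption, as are all the names used). Formulas are nested tuples, and a handful of sample formulas are checked for agreement between classical and four-valued validity.

```python
from itertools import product

# A four-valued "value" is a pair: (verdict of person 1, verdict of person 2).
VALUES4 = list(product((0, 1), repeat=2))
DESIGNATED4 = {(1, 1)}          # assumption: designated = accepted by both


def neg2(x):
    return 1 - x


def imp2(x, y):
    return max(1 - x, y)


def neg4(a):
    return tuple(neg2(x) for x in a)


def imp4(a, b):
    return tuple(imp2(x, y) for x, y in zip(a, b))


def evaluate(formula, valuation, neg, imp):
    """Formulas are nested tuples: 'p', ('neg', A) or ('imp', A, B)."""
    if isinstance(formula, str):
        return valuation[formula]
    if formula[0] == "neg":
        return neg(evaluate(formula[1], valuation, neg, imp))
    return imp(evaluate(formula[1], valuation, neg, imp),
               evaluate(formula[2], valuation, neg, imp))


def atoms(formula):
    if isinstance(formula, str):
        return {formula}
    return set().union(*(atoms(part) for part in formula[1:]))


def valid(formula, values, designated, neg, imp):
    names = sorted(atoms(formula))
    return all(
        evaluate(formula, dict(zip(names, combo)), neg, imp) in designated
        for combo in product(values, repeat=len(names))
    )


SAMPLES = [
    ("imp", ("imp", ("imp", "p", "q"), "p"), "p"),                    # Peirce's law
    ("imp", ("imp", ("neg", "p"), ("neg", "q")), ("imp", "q", "p")),  # contraposition
    ("imp", "p", "q"),                                                # not a tautology
    ("imp", ("imp", "p", "q"), ("imp", "q", "p")),                    # not a tautology
]

agreement = all(
    valid(f, VALUES4, DESIGNATED4, neg4, imp4) == valid(f, (0, 1), {1}, neg2, imp2)
    for f in SAMPLES
)
print(agreement)
```

Because each component evaluates classically, a formula is designated under every four-valued assignment exactly when it is a classical tautology, which is the coincidence of contents described above; the samples only spot-check that fact.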

A very natural application of many-valuedness is for the analysis of vagueness and
inexactness and their associated paradoxes, like the Sorites Paradox (Körner, 1966;
Williamson, 1994). This application gave rise to fuzzy set theory (Zadeh, 1965),
and ultimately to the theory of fuzzy logics (Zadeh, 1975), which has become an
autonomous research discipline, with use in artificial intelligence (AI), computer
science and steering theory (Turner, 1984, and Gottwald, 1981).

Just as classical logic and Boolean algebras have been successfully used in switching
theory and computer science for the analysis, synthesis and minimalization of
multiplex networks, so there has been interest in the possibility of applying many-valued
logics for similar purposes. These have resulted in several techniques for the
analysis and synthesis of electronic circuits and relays based mainly on Moisil and
Post algebras (Rine, 1977). The practical switchover of two oppositely oriented
contacts positioned in parallel branches of a circuit that must change their positions
simultaneously is the simplest possible electronic circuit to consider within a three-valued
framework. Since there are good reasons to drop the idealistic assumption
that effecting the circuit, e.g., using relays, would really change the positions of both
contacts instantly, i.e., that the circuit would pass from the state 1 to the state 0,
then, obviously, there is a third state that might also obtain. A generalization of this
construction for any number of contacts similarly results in $n$ states. Finally, obtaining
a description of networks composed of such switchovers is possible within an appropriate
algebraic base. For this, many-valued algebras with 'modal' functions (Moisil
algebras, i.e., Lukasiewicz $n$-valued algebras, and Post algebras) appear useful. The
most important advantage of the many-valued approach is the possibility of eliminating
possible switching disturbances through the algebraic synthesis of the networks
(Moisil, 1966, 1972).

Logical many-valuedness has also been used successfully in computer science, in
both hardware and software. Full-scale ternary computers were even completed
twice: in the USSR (SETUN, 1958) and in the USA (TERNAC, 1973). These
attempts, however, showed technical difficulties that were too big compared to the
gain. So, efforts aimed at full hardware realization of many-valued computers have
been reduced and attention directed instead towards the synthesis and construction
of digital devices, e.g., memories, using many-valued algebras, especially Post algebras
(Epstein et al., 1974; Rasiowa, 1977).


Suggested further reading. 


For a more complete discussion of the topics discussed here, see Malinowski (1993). As
further readings, the following items are also recommended: Bolc and Borowik (1992),
Haack (1996), Lukasiewicz (1970c), Rine (1977), Rosser and Turquette (1952), Turner
(1984) and Urquhart (1986).


Notes


1 The truth tables of binary connectives * are viewed as follows: the value of $\alpha$ is placed in
the first vertical line, under the connective, the value of $\beta$ in the first horizontal line,
beside the connective, and the value of $\alpha * \beta$ at the intersection of the two lines.

2 Recall that this set of connectives is sufficient to define all other classical connectives and
thus warrants functional completeness of the underlying algebra and logic.

3 The coincidence with Bochvar is striking. However, Halldén's work is independent and
original; compare, e.g., Williamson (1994).

4 Var($\alpha$) and Var(X) are the sets of variables appearing in the formula $\alpha$ and in the
formulas of X, respectively.

5 Similar $n$-element models (any $n \ge 2$) of the classical logic may be provided using matrices
having standard connectives described in section 14.8.

6 The 0-ary operations are constants, i.e., elements of $E_n$.

7 Lukasiewicz (1970, p. 144); no publication on the topic by Wajsberg exists.

8 Note that Urquhart's table gives the Kleene strong implication.

9 See Malinowski (1993, pp. 81-3) for a more detailed account.

Chapter 15


Nonmonotonic Logic 
John F. Horty 


15.1. Introduction 


The goal of a logic is to define a consequence relation between a set of formulas $\Gamma$
and, in most cases, an individual formula A. This definition generally takes one of
two forms. From a proof theoretic standpoint, A is said to be a consequence of $\Gamma$
whenever there is a deduction of A from the set $\Gamma$, viewed as a set of premises; from
a model theoretic standpoint, A is said to be a consequence of $\Gamma$ whenever A holds
in every model that satisfies each formula in $\Gamma$.

Although the detailed inferences sanctioned by particular logics vary widely depending
on the connectives present and the properties attributed to them, certain
abstract features of the consequence relation are remarkably stable across logics.
Among these is the property of monotonicity: if A is a consequence of $\Gamma$, then A is
a consequence of $\Gamma \cup \{B\}$. What this means is that any conclusion drawn from a set
of premises will be preserved as a conclusion even if the premise set is supplemented
with additional information; that is, the set of conclusions grows monotonically as the
premise set grows.

The monotonicity property flows from assumptions that are deeply rooted in both
the proof theory and the semantics, not only of classical logic, but of most philosophical
logics as well. From the proof theoretic standpoint, monotonicity follows
from the fact that any derivation of the formula A from the premise set $\Gamma$ also
counts as a derivation of that formula from the expanded premise set $\Gamma \cup \{B\}$; the
addition of further premises cannot perturb a derivation, since standard inference
rules depend only on the presence of information, not its absence. The verification
of monotonicity is, if anything, even more immediate from the model theoretic
standpoint: since every model of $\Gamma \cup \{B\}$ is a model of $\Gamma$, it follows at once, if the
formula A holds in every model of $\Gamma$, that it must hold also in every model of
$\Gamma \cup \{B\}$.
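
The model theoretic argument for monotonicity can be watched at work in classical propositional logic. The following sketch is illustrative: formulas are represented as Python predicates on truth assignments over three atoms, and consequence is checked by enumerating all assignments. The names `valuations`, `entails`, `gamma` and `extras` are hypothetical.

```python
from itertools import product

ATOMS = ("p", "q", "r")


def valuations():
    """Every classical truth assignment to the atoms."""
    for bits in product((False, True), repeat=len(ATOMS)):
        yield dict(zip(ATOMS, bits))


def entails(premises, conclusion):
    """Gamma |= A: A holds in every valuation satisfying all of Gamma."""
    return all(conclusion(v) for v in valuations()
               if all(g(v) for g in premises))


gamma = [lambda v: v["p"], lambda v: (not v["p"]) or v["q"]]   # p, p -> q
A = lambda v: v["q"]

# Candidate extra premises B, including one that contradicts the conclusion.
extras = [lambda v: v["r"], lambda v: not v["r"], lambda v: not v["q"]]

preserved = entails(gamma, A) and all(entails(gamma + [b], A) for b in extras)
print(preserved)
```

Adding a premise can only shrink the set of models, so the conclusion survives every extension, even the inconsistent one, where it holds vacuously.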

A nonmonotonic logic is simply one whose consequence relation fails to satisfy the
monotonicity property, so that the addition of further premises can lead to the
retraction of a conclusion already drawn, and the conclusion set need not increase
monotonically with the premise set. Although certain philosophical logics, such as
relevance logic [see chapter 13], could be classified as nonmonotonic in this sense,
the phrase is generally reserved for a family of logics originating in the field of
artificial intelligence (AI), and aimed at formalizing the patterns of default reasoning
that seem to guide much of our intelligent behavior.

Without attempting anything like a formal definition, one can think of default
reasoning, very roughly, as reasoning that relies on the absence of information as
well as its presence, often mediated by rules of the general form: given P, conclude
Q unless there is information to the contrary. It is easy to see why a logical account
of this kind of reasoning requires a nonmonotonic consequence relation. Suppose,
for example, that the generic truth 'Birds fly' is taken to express such a default: given
that x is a bird, conclude that x flies unless there is information to the contrary. And
suppose one is told that Tweety is a bird. Taken alone, these two premises (that
birds fly, and that Tweety is a bird) would then support the conclusion that Tweety
flies, since the premise set contains no information to the contrary. But now, imagine
that this premise set is supplemented with the additional information that Tweety
does not fly (perhaps Tweety is a penguin, or a baby bird). In that case, the original
conclusion that Tweety flies would have to be withdrawn, since the default leading
to this conclusion relied on the absence of information to the contrary, but the new
premise set now contains such information.
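
The Tweety example can be mimicked with a deliberately naive program. The sketch below is a toy, not an implementation of any particular nonmonotonic formalism: facts are plain strings, and a single hard-coded default is applied only when its prerequisite is present and the negation of its conclusion is absent. The function name `conclusions` and the string encoding are assumptions made for illustration.

```python
def conclusions(facts):
    """Apply the single default 'birds fly' to a set of literal strings.

    The default fires only when 'bird(tweety)' is present AND the
    contrary information 'not flies(tweety)' is absent.
    """
    derived = set(facts)
    if "bird(tweety)" in derived and "not flies(tweety)" not in derived:
        derived.add("flies(tweety)")
    return derived


before = conclusions({"bird(tweety)"})
after = conclusions({"bird(tweety)", "not flies(tweety)"})

print("flies(tweety)" in before)   # conclusion drawn from the smaller premise set
print("flies(tweety)" in after)    # conclusion withdrawn after adding a premise
```

The premise set grows from the first call to the second, yet the set of conclusions does not: the test for the absence of information is exactly what breaks monotonicity.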

The field of nonmonotonic logic began in the late 1970s as an attempt to represent
this kind of reasoning within a general logical framework. Since then, the
area has been the focus of intense activity, giving rise to hundreds of conference
and journal papers, most of which, however, are still confined to the AI literature.
At this point, it would be impossible to provide a balanced survey of the field
in anything less than a full-length monograph. The present chapter is intended,
instead, only as an introductory presentation of two of the main lines of approach,
a fixed-point theory and a model-preference theory, in a way that is accessible to
a philosophical audience, with an emphasis on conceptual rather than implementational
issues.


15.2. Some Motivating Problems 


Here are some of the problems that led to the development of nonmonotonic logics,
namely, the frame problem, first noticed by McCarthy and Hayes (1969), what is
known as the qualification problem, and the problems of closed-world reasoning and
defeasible inheritance reasoning.


15.2.1. The frame problem 


One of the most important reasoning tasks studied within AI is that of planning –
the problem of finding, in the simplest case, a sequence of actions to achieve a
specified goal from a specified initial state. Within a logical framework, the planning
problem is often studied from the standpoint of the situation calculus, a first-order
formalism containing expressions of the form H[φ, s] to represent the fact that
the proposition φ holds in the situation s, and allowing also for a description of the
effects of various actions.

To illustrate the use of this formalism, imagine that four blocks – A, B, C, and D
– are arranged on a table, with blocks A, C, and D set on the table's surface, block
B stacked on top of block A, and none of the others having anything on top of
them. If this situation is referred to as s1, some of the relevant facts from the
situation might be depicted through the formulas


\[
\begin{aligned}
& H[\mathit{On}(B, A), s_1] \\
& H[\mathit{Clear}(B), s_1] \\
& H[\mathit{Clear}(C), s_1] \\
& H[\mathit{Clear}(D), s_1]
\end{aligned}
\tag{15.1}
\]


which state that the proposition that block B is on block A holds in the situation s1,
as do the propositions that the blocks B, C, and D are clear. Note that expressions
like On(B, A) and Clear(B) are treated grammatically as complex terms referring to
propositions or facts, not as sentences.

Suppose then that these blocks must be manipulated using a robot arm that can
perform only two primitive actions: stacking one block on another and unstacking
one block from another (and placing it on the table). Letting Stack(X, Y) and
Unstack(X, Y) represent the actions of stacking X on Y and unstacking X from Y,
the effects of these actions can be captured through the axioms


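\[
\begin{aligned}
& (H[\mathit{Clear}(X), s] \wedge H[\mathit{Clear}(Y), s] \wedge X \neq Y) \supset
  H[\mathit{On}(X, Y), \mathit{Res}(\langle \mathit{Stack}(X, Y) \rangle, s)] \\
& (H[\mathit{On}(X, Y), s] \wedge H[\mathit{Clear}(X), s]) \supset
  H[\mathit{Clear}(Y), \mathit{Res}(\langle \mathit{Unstack}(X, Y) \rangle, s)]
\end{aligned}
\tag{15.2}
\]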


in which it is assumed that all variables are universally quantified. Where α is a
sequence of actions, the expression Res(α, s) denotes the situation that results when
the actions in α are executed in turn, beginning with situation s. What the first of
these two axioms says, then, is that, as long as the distinct blocks X and Y are both
clear in the situation s, the situation that results from s when X is stacked on Y is
one in which X is on Y; the second axiom says that, if X is on Y and X is clear in
s, then Y is clear in the situation that results from s by unstacking X from Y.

Of course, these two axioms define the effects only of action sequences containing
a single action, the base case. The effects of longer sequences can be defined
inductively by stipulating that


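\[
\mathit{Res}(\langle A_1, \ldots, A_n \rangle, s) =
\mathit{Res}(\langle A_n \rangle, \mathit{Res}(\langle A_1, \ldots, A_{n-1} \rangle, s))
\tag{15.3}
\]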





when n is greater than one; the result of executing a sequence of n actions in a
situation s is equivalent to the result of executing the last of these actions in the
situation that results from executing all but the last.
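
The inductive clause for Res can be mirrored by a short recursive function. This is
only a sketch: the representation of situations as nested tuples and of actions as
strings is an assumption made for illustration, not something fixed by the situation
calculus itself.

```python
def res(actions, s):
    """Situation term denoted by Res(actions, s), following the inductive clause.

    Hypothetical encoding: a situation is either a name such as "s1" or a
    tuple ("Res", action, earlier_situation).
    """
    if not actions:
        raise ValueError("Res is only defined for non-empty sequences")
    if len(actions) == 1:
        return ("Res", actions[0], s)        # base case: a single action
    *earlier, last = actions
    # Execute the last action in the situation reached by all the others.
    return res([last], res(earlier, s))


plan = ["Unstack(B, A)", "Stack(A, C)"]
print(res(plan, "s1"))
# ('Res', 'Stack(A, C)', ('Res', 'Unstack(B, A)', 's1'))
```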
Now suppose that Γ is a set of sentences containing a description of some initial
situation s as well as axioms specifying the effects of the available actions and perhaps
some bookkeeping material, such as the inductive definition of the Res function; and
let φ represent the proposition desired as a goal. Then the planning problem is the
problem of finding an action sequence α whose execution in the initial state s can be
proved from the information in Γ to yield a state in which the goal proposition φ
holds – more formally, a sequence α for which it can be shown that


\[
\Gamma \vdash H[\phi, \mathit{Res}(\alpha, s)]
\]


where ⊢ is the classical consequence relation.

As a concrete example, imagine that s1 above is the initial state, and that Γ
contains the statements (15.1)–(15.3): the four sentences describing the initial state,
the axioms describing the Stack and Unstack actions, and the inductive specification
of the Res function. Now suppose the goal is to achieve a situation in which block
A is stacked on top of block C – that is, a situation in which the statement On(A,
C) holds. In this simple case, it is easy to find an appropriate plan: first unstack B
from A, then stack A on C. More formally, the appropriate plan appears to be
⟨Unstack(B, A), Stack(A, C)⟩, and it seems intuitively – just thinking about how this
sequence of actions should work – that it should be possible to verify the correctness
of this plan by establishing that





\[
\Gamma \vdash H[\mathit{On}(A, C),
\mathit{Res}(\langle \mathit{Unstack}(B, A), \mathit{Stack}(A, C) \rangle, s_1)]
\]


showing that the plan achieves its goal.

In fact, however, this result cannot be established, and it is important to see why.
Because Γ contains the statements H[On(B, A), s1] and H[Clear(B), s1], one can
indeed conclude from the Unstack axiom that


\[
H[\mathit{Clear}(A), \mathit{Res}(\langle \mathit{Unstack}(B, A) \rangle, s_1)]
\]


which states that the block A is clear in the situation that results from s1 when B is
unstacked from A. And because Γ contains H[Clear(C), s1], one knows that the
block C was already clear in the initial state. Since A is now clear as well, it is
reasonable to think that a goal state could now be achieved simply by stacking block
A onto block C – that is, that the Stack axiom could be used to derive


\[
H[\mathit{On}(A, C), \mathit{Res}(\langle \mathit{Stack}(A, C) \rangle,
\mathit{Res}(\langle \mathit{Unstack}(B, A) \rangle, s_1))]
\]


from which the desired conclusion would then follow by the definition of the Res
function. Unfortunately, this application of the Stack axiom would require one to
know, not just that C is clear in the original state, but that C remains clear also in
the state that results from the Unstack(B, A) action – that is, one would need to be
able to establish


\[
H[\mathit{Clear}(C), \mathit{Res}(\langle \mathit{Unstack}(B, A) \rangle, s_1)]
\tag{15.4}
\]

as an intermediate step.

Of course, this intermediate step seems perfectly natural from the standpoint of
one’s ordinary reasoning about actions: since C is clear in the initial state, it is natural
to suppose that it would remain clear even after B is unstacked from A. In fact, however,
nothing in Γ allows this intermediate step to be derived – and indeed, the step
should not be derivable as a matter of logic, for it is always possible, at least, that the
removal of B from A does interfere with the fact that C is clear. (Perhaps blocks B
and D are connected by a wire in such a way that removing B from A causes D to be
pulled to the top of C; this possibility is consistent with the information in Γ.) What
one has here is the notorious frame problem, originally noticed by McCarthy and
Hayes (1969). When an action is performed, some facts change and some do not.
How can one tell which are which, and in particular, how does one propagate those
facts that do not change from the original to the resulting situation in a natural way?
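
The gap can be made concrete with a small sketch that applies the two effect axioms
to the initial situation and records only what they license. The helper names and
the encoding of facts as (proposition, situation) pairs are assumptions introduced
for illustration; the sketch simply fires each axiom once rather than implementing
a theorem prover.

```python
def apply_unstack(holds, x, y, s):
    """Unstack axiom: from On(x, y) and Clear(x) in s, derive Clear(y) afterwards."""
    s_next = ("Res", f"Unstack({x}, {y})", s)
    derived = set()
    if (f"On({x}, {y})", s) in holds and (f"Clear({x})", s) in holds:
        derived.add((f"Clear({y})", s_next))
    return derived, s_next


def apply_stack(holds, x, y, s):
    """Stack axiom: from Clear(x), Clear(y) and x != y in s, derive On(x, y) afterwards."""
    s_next = ("Res", f"Stack({x}, {y})", s)
    derived = set()
    if x != y and (f"Clear({x})", s) in holds and (f"Clear({y})", s) in holds:
        derived.add((f"On({x}, {y})", s_next))
    return derived, s_next


# The four facts describing the initial situation s1.
holds = {("On(B, A)", "s1"), ("Clear(B)", "s1"),
         ("Clear(C)", "s1"), ("Clear(D)", "s1")}

new_facts, s2 = apply_unstack(holds, "B", "A", "s1")
holds |= new_facts
new_facts, s3 = apply_stack(holds, "A", "C", s2)
holds |= new_facts

print(("Clear(A)", s2) in holds)   # True: licensed by the Unstack axiom
print(("Clear(C)", s2) in holds)   # False: nothing says C stays clear
print(("On(A, C)", s3) in holds)   # False: so the Stack axiom cannot fire
```

Nothing carries Clear(C) forward from s1 to the later situation, so the precondition
of the Stack axiom is never met there.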


15.2.2. The qualification problem 


Look again at the axiom governing the Stack action. Notice that it does not state
that X will be on Y in any situation that results from a Stack(X, Y) action, but only
that X will be on Y as long as X and Y are distinct blocks that are both clear in the
original situation. These qualifications are necessary, of course, because the robot
arm cannot reach blocks that are not clear, and because it is impossible to stack a
block on top of itself.

But once these qualifications are in place, is the Stack axiom then correct? Well,
no. What if the block X is so slippery that the robot arm cannot pick it up? What if
X is so heavy that it will crush the block Y? What if Y is a bomb that will explode
if another block is placed on top of it? The difficulty suggested by these peculiar
considerations is known as the qualification problem: how does one arrive at an
accurate, suitably qualified formulation of the axioms governing actions?

One might respond to this problem by deciding simply to fold all the various
possible qualifications into the antecedent of the axioms, either explicitly or implicitly.
In the present case, for example, one might introduce a new propositional
constant Weird to represent the occurrence of a weird circumstance that would
interfere with the Stack action, and then modify the axiom governing this action
with the further precondition that no such weird circumstances occur:


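\[
(H[\mathit{Clear}(X), s] \wedge H[\mathit{Clear}(Y), s] \wedge X \neq Y \wedge \neg \mathit{Weird})
\supset H[\mathit{On}(X, Y), \mathit{Res}(\langle \mathit{Stack}(X, Y) \rangle, s)]
\tag{15.5}
\]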


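The interfering circumstances imagined in the previous paragraph could then be
classified, quite naturally, as weird:

\[
\begin{aligned}
& \mathit{Slippery}(X) \supset \mathit{Weird} \\
& \mathit{Heavy}(X) \supset \mathit{Weird} \\
& \mathit{Bomb}(Y) \supset \mathit{Weird}
\end{aligned}
\tag{15.6}
\]
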
There are, however, two problems with this suggestion. The first – to which I
know of no solution – is that the list of circumstances that might interfere with a
stacking action is open-ended. No conceivable list of possible interfering circumstances
could be complete. What if a meteor hits the laboratory and destroys the
robot? Then the stack action would not be successful. What if there is an evil demon
in the room that does not want to see X on Y and will knock X out of the hand of
the robot arm as it approaches Y?

The second problem is more subtle, and would arise even if there was a relatively
exhaustive list of qualifications. The point of placing preconditions in the antecedent
of an action axiom is that one must verify that the preconditions are satisfied before
concluding that the action is successful. And it does seem reasonable, in the case of
the Stack axiom, that one should have to verify that the blocks X and Y must both
be clear before one can know that the result of stacking X on Y is successful. But it
seems less reasonable to suppose that one must actually have to verify that all of the
various weird circumstances that might interfere with this action do not occur – that
there is no bomb, no meteor, no evil demon, and so on. It would be better to be
able simply to assume that weird circumstances like these do not occur unless there
is information to the contrary.


15.2.3. Closed-world reasoning 


Suppose I ask my travel agent if United Airlines has a direct flight from Washington
to Barcelona. The travel agent has access to a database containing flight information. 
From a logical standpoint, one can think of this database as a set of sentences of the 
form 


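\[
\begin{aligned}
& \mathit{Connects}(\mathit{UA354}, \mathit{Baltimore}, \mathit{Boston}) \\
& \mathit{Connects}(\mathit{UA750}, \mathit{Washington}, \mathit{London}) \\
& \mathit{Connects}(\mathit{UA867}, \mathit{London}, \mathit{Barcelona})
\end{aligned}
\tag{15.7}
\]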


and so on; the travel agent answers my question by drawing inferences from these
sentences. Suppose I am told: No, there is no direct flight from Washington to
Barcelona. How can the travel agent reach this conclusion? The airline database only
says what cities are connected by what flights; it does not list the cities that are not
connected, and certainly this kind of negative information does not follow as an
ordinary logical consequence from the positive information provided.

The answer is that the travel agent's reasoning is governed by a convention
known as the closed-world assumption (Reiter, 1978), which states, in the simplest
case, that all relevant positive information is explicitly listed. Because of this convention,
it is legitimate to conclude that a positive proposition is false whenever it is not
explicitly present in the database; the travel agent can legitimately conclude, for
example, that there is no direct flight between Washington and Barcelona simply
because no such flight is listed.
The closed-world assumption applies, of course, not only to the airline database,
but to any number of situations in which positive information is overwhelmed by
negative information. When I look at a list of people invited to a party, I can
conclude, if I am not on the list, that I am not invited to the party; when I look at
my desk calendar, I can conclude, if there is no doctor's appointment listed for
Thursday at 3:00, that I have no doctor's appointment at that time. Reasoning
based on the closed-world assumption exemplifies the general pattern of default
reasoning as relying on the absence of information: lacking information to the
contrary, one can assume that there is no direct flight between two cities; an entry in
the database provides information to the contrary.
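
The travel agent's inference can be sketched in a few lines of Python. The three
listed flights are the ones mentioned in the text; the function name, the tuple
encoding, and the extra flight UA999 added at the end are hypothetical and serve
only to show how a new entry overturns the earlier negative answer.

```python
# Positive information only: (flight, origin, destination).
FLIGHTS = {
    ("UA354", "Baltimore", "Boston"),
    ("UA750", "Washington", "London"),
    ("UA867", "London", "Barcelona"),
}


def direct_flight(origin, destination, database=FLIGHTS):
    """Closed-world query: a connection is taken to hold iff it is listed."""
    return any(o == origin and d == destination for _, o, d in database)


print(direct_flight("Washington", "London"))      # True
print(direct_flight("Washington", "Barcelona"))   # False: not listed, so denied

# Adding a (hypothetical) entry retracts the earlier negative conclusion.
extended = FLIGHTS | {("UA999", "Washington", "Barcelona")}
print(direct_flight("Washington", "Barcelona", extended))  # True
```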


15.2.4. Defeasible inheritance reasoning


Returning to the initial example: birds fly, Tweety is a bird, therefore Tweety flies. 
Reasoning like this is known in AI as inheritance reasoning, and was originally 
developed in response to the need for an efficient way of representing and accessing 
taxonomic information. Rather than having to list explicitly the properties of each 
individual, it is imagined that classes and properties are arranged in a taxonomic
hierarchy, and that individuals inherit their properties from the classes to which they 
belong. It is not necessary to state explicitly that Tweety flies, since this property is 
inherited from the general class of birds. 

This kind of taxonomic reasoning has been familiar since Aristotle, and was explored
in some detail by medieval philosophers; what is new in AI is the idea that –
again, for reasons of efficiency – the taxonomy is often allowed to represent defeasible
as well as strict information. An example of such a defeasible inheritance network is
provided in Figure 15.1, known as the Tweety Triangle. Here, strict links are
represented by the strong arrow ⇒ and defeasible links by the weak arrow →, so
that the displayed network provides the following information: Tweety is a penguin;
penguins are birds; as a rule, birds tend to fly, and penguins tend not to.

Figure 15.1. The Tweety Triangle

When these defeasible inheritance networks were first introduced, they were supplied
only with a ‘procedural’ semantics, according to which the meaning of the
representations was supposed to be specified implicitly by the inference algorithms
operating on them. It was soon realized, however, that these algorithms could lead
to bizarre and unintuitive results in complicated cases, and researchers felt the need
to provide an implementation-independent account of the meaning of these network
formalisms. One natural idea involved providing a logical interpretation of the
networks – interpreting the individual links in the network as logical formulas, and so
the entire network as a collection of formulas, whose meaning could then be specified
by the appropriate logic. The logical interpretation of strict links, of course,
presents no problems: a link like Tweety ⇒ Penguin, for example, could naturally be
represented as an atomic statement, such as Pt, and a link like Penguin ⇒ Bird as a
universal statement of the form ∀x(Px ⊃ Bx). But there is nothing in ordinary logic
to represent the defeasible links Bird → Fly and Penguin ↛ Fly, carrying the intuitive
meaning that birds fly and that penguins do not.


15.3. A Fixed-Point Approach: Default Logic 


Perhaps the best known and most widely applied formalism for nonmonotonic
reasoning is default logic, introduced by Reiter (1980). This formalism results from
supplementing ordinary logic with new rules of inference, known as default rules,
and then modifying the standard notion of logical consequence to accommodate
these new rules.


15.3.1. Basic ideas


An ordinary rule of inference (with a single premise) can be depicted simply as a
premise/conclusion pair, such as (A / B); this rule commits the reasoner to B once A
has been established. By contrast, a default rule is a triple, of the form (A : C / B).
Very roughly, such a rule commits the reasoner to B once A has been established
and, in addition, C is consistent with the reasoner’s conclusion set. The formula A
is referred to as the prerequisite of this default rule, B as its consequent, and C as its
justification. A default theory is a pair Δ = ⟨W, D⟩, in which W is a set of ordinary
formulas and D is a set of default rules.

Before characterizing the new notion of logical consequence defined by Reiter,
consider how default logic might be used to represent the initial example, in which
one is told that Tweety is a bird and that birds fly. The generic statement that birds
fly can reasonably be taken to mean something like: once one learns of an object x
that it is a bird, one should conclude that x flies unless there is information to the
contrary – unless, that is, this conclusion is inconsistent with one’s beliefs. What this
suggests is that the generic statement should be represented as a sort of universally
quantified default rule, perhaps of the form ∀x(Bx : Fx / Fx), but unfortunately it is
no more meaningful to quantify a default rule than it is to quantify an ordinary rule
of inference. To avoid this problem, Reiter allows open formulas to occur in defaults,
so that the generalization concerning birds can be expressed as (Bx : Fx / Fx). However,
to avoid the resulting complexities – involving the application of these open defaults
to yield closed formulas – the somewhat simpler approach of representing these
defeasible generalizations, not by open defaults, but instead by appropriate instances
of these defaults for each object in the domain is adopted here. In the present case,
where Tweety is the only object of concern, the only default necessary is (Bt : Ft /
Ft), which says that if Tweety is a bird, one should conclude that Tweety flies as
long as this is consistent with what is known. The information from this initial
example can then be represented through the default theory Δ1 = ⟨W1, D1⟩, where
W1 = {Bt} and D1 = {(Bt : Ft / Ft)}.

In this example, because one knows that Bt, and because Ft is consistent with
one’s knowledge, the default rule justifies drawing the conclusion Ft. The appropriate
conclusion set based on Δ1 therefore seems to be Th({Bt, Ft}), the logical closure
of what one is told to begin with, together with the conclusions of the applicable
defaults. If one is told, in addition, that Tweety does not fly, one moves to the
default theory Δ2 = ⟨W2, D2⟩, with D2 = D1 and W2 = W1 ∪ {¬Ft}. Here the default
rule (Bt : Ft / Ft) can no longer be applied, because its justification is now inconsistent
with one’s knowledge, and so the appropriate conclusion set based on Δ2 is
simply Th(W2).


15.3.2. Extensions 


The discussion of this example illustrates the kind of conclusion sets desired from
particular default theories. The task of arriving at a general definition of this notion,
however, is not trivial; the trick is to find a way of capturing the meaning of the new
component – the justification – present in default rules.

In ordinary logic, the conclusion set associated with a set of formulas W is simply
Th(W), the logical closure of W. It might seem, then, that the conclusion set associated
with a default theory Δ = ⟨W, D⟩ should be


\[
E = \mathit{Th}(W) \cup \{ C : (A : B / C) \in D,\ A \in \mathit{Th}(W),\ \neg B \notin \mathit{Th}(W) \}
\]


the closure of W together with the consequents of those default rules whose prerequisites
are entailed by W and whose justifications are consistent with W. A moment’s
thought, however, shows that this suggestion is inadequate. For one thing, the set
defined in this way is not even closed under logical consequence: the addition of the
consequent from some default rule into the set E may trigger new logical implications
that should, intuitively, be included in the conclusion set, or worse still, the
addition of the consequent from one default rule may trigger the firing of another.
As an example, consider the default theory Δ3 = ⟨W3, D3⟩ in which W3 = {A} and
D3 = {(A : B / C), (C : D / E)}. The above definition correctly adds the consequent C
of the first default rule into the conclusion set E. It seems, though, that the presence
of C should then trigger the firing of the second rule, resulting also in the addition
of E to the conclusion set, but this statement is not included.

What this example suggests is that the definition of the appropriate conclusion set
for a default theory should be iterative. Perhaps one should take the conclusion set
of the default theory Δ = ⟨W, D⟩ to be the limit E of the sequence E0, E1, E2, ...
defined by

\[
\begin{aligned}
E_0 &= W \\
E_{i+1} &= \mathit{Th}(E_i) \cup \{ C : (A : B / C) \in D,\ A \in \mathit{Th}(E_i),\ \neg B \notin \mathit{Th}(E_i) \}
\end{aligned}
\]


This suggestion responds to the previous concern, giving Th({A, C, E}) as the
conclusion set for the default theory Δ3, as desired. Now, however, there is a new
problem, illustrated by the theory Δ4 = ⟨W4, D4⟩, with W4 = {A, B ⊃ ¬C} and
D4 = {(A : C / B)}. Tracing through the iteration, one can see that the rule (A : C / B)
is applicable at the first stage, since its prerequisite belongs to Th(E0) and its justification
is consistent with this set; hence one has B in E1. Just a bit of additional
reasoning then shows that ¬C must belong to E2, and so to E, since this formula is
a logical consequence of the information contained in E1. The rule (A : C / B) seems
initially to be applicable, since, prior to its application, there is no reason to conclude
¬C; but once the rule has been applied, the information it provides does allow
us to derive ¬C. The rule thus seems to undermine its own applicability.
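
The behaviour of the iterative proposal on the two example theories can be traced
with a small propositional sketch. Everything about the encoding is an assumption
made for illustration: formulas are Python functions from valuations to truth values,
entailment is decided by truth tables over a fixed list of atoms, and a stage E_i is
represented by a finite list of formulas that generates it.

```python
from itertools import product

ATOMS = ["A", "B", "C", "D", "E"]


def entails(premises, formula):
    """Truth-table test: every valuation satisfying the premises satisfies formula."""
    for bits in product([False, True], repeat=len(ATOMS)):
        v = dict(zip(ATOMS, bits))
        if all(p(v) for p in premises) and not formula(v):
            return False
    return True


def atom(name):
    return lambda v: v[name]


def neg(f):
    return lambda v: not f(v)


def implies(f, g):
    return lambda v: (not f(v)) or g(v)


def iterate(facts, defaults):
    """Run the iterative proposal; return generators of the limit and fired rules."""
    gens, fired = list(facts), []
    while True:
        # A rule fires at stage i if its prerequisite is in Th(E_i) and the
        # negation of its justification is not in Th(E_i).
        new = [name for name, (pre, just, cons) in defaults.items()
               if name not in fired
               and entails(gens, pre)
               and not entails(gens, neg(just))]
        if not new:
            return gens, fired
        for name in new:
            fired.append(name)
            gens.append(defaults[name][2])


A, B, C, D, E = (atom(n) for n in ATOMS)

# First example: facts {A}, rules (A : B / C) and (C : D / E).
gens3, fired3 = iterate([A], {"r1": (A, B, C), "r2": (C, D, E)})
print(fired3, entails(gens3, E))          # ['r1', 'r2'] True

# Second example: facts {A, B -> not C}, rule (A : C / B).
gens4, fired4 = iterate([A, implies(B, neg(C))], {"r": (A, C, B)})
print(fired4, entails(gens4, neg(C)))     # ['r'] True
```

In the second run the rule fires, and the resulting set then entails the negation of
the very justification that allowed it to fire.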

Of course, a chain of reasoning like this showing that some default rule is undermined
can be arbitrarily long; and so one cannot really be sure that a default rule is
applicable in some context until one has applied it, along with all the other rules that
seem applicable, and then one has surveyed the logical closure of the result. Because
of this, the conclusion set associated with a default theory cannot be defined in the
usual iterative way, by successively adding to the original data the conclusions of the
applicable rules of inference, and then taking the limit of this process.

Instead, Reiter is forced to adopt a fixed-point approach in specifying the appropriate
conclusion sets of default theories – which are described as extensions. In fact,
he actually offers two characterizations of the concept of an extension. The first
considered here, although not the official definition, is both more intuitive and
more useful in practice. The idea behind this particular characterization is that, given
a default theory, one first conjectures a candidate extension for the theory, and then
– using this candidate – defines a sequence of approximations to some conclusion
set. If this approximating sequence has the original candidate as its limit, the candidate
is then certified as an extension for the default theory.


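Definition 15.1 The set E is an extension of the default theory Δ = ⟨W, D⟩ iff (if
and only if) there exists a sequence of sets E0, E1, E2, ... such that

\[
\begin{aligned}
E &= \bigcup_{i=0}^{\infty} E_i \\
E_0 &= W \\
E_{i+1} &= \mathit{Th}(E_i) \cup \{ C : (A : B / C) \in D,\ A \in \mathit{Th}(E_i),\ \neg B \notin E \}
\end{aligned}
\]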


Here, of course, the set E is the candidate, which is certified as a true extension of
Δ if it turns out that E coincides with the union of the approximating sequence
E0, E1, E2, .... Note that E itself figures in the definition of E_{i+1}, so that the
approximating sequence is defined in terms of the original candidate.
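
For finite propositional theories, the guess-and-check procedure of Definition 15.1
can be sketched directly. The encoding below is an assumption made for illustration:
formulas are Python functions on valuations, entailment is decided by truth tables,
and both the candidate and the stages of the sequence are represented by finite
lists of formulas whose closure they stand for.

```python
from itertools import product

ATOMS = ["Bt", "Ft"]


def entails(premises, formula):
    """Truth-table test: every valuation satisfying the premises satisfies formula."""
    for bits in product([False, True], repeat=len(ATOMS)):
        v = dict(zip(ATOMS, bits))
        if all(p(v) for p in premises) and not formula(v):
            return False
    return True


def atom(name):
    return lambda v: v[name]


def neg(f):
    return lambda v: not f(v)


def is_extension(facts, defaults, candidate):
    """Check Definition 15.1 for the candidate E = Th(candidate)."""
    gens = list(facts)                      # generators of E_0 = W
    changed = True
    while changed:
        changed = False
        for pre, just, cons in defaults:
            # Prerequisite tested against the current stage, justification
            # tested against the candidate E itself.
            applicable = (entails(gens, pre)
                          and not entails(candidate, neg(just)))
            if applicable and not entails(gens, cons):
                gens.append(cons)
                changed = True
    # The union of the stages is Th(gens); compare it with Th(candidate).
    return (all(entails(gens, f) for f in candidate)
            and all(entails(candidate, f) for f in gens))


Bt, Ft = atom("Bt"), atom("Ft")
bird_default = [(Bt, Ft, Ft)]               # the rule (Bt : Ft / Ft)

print(is_extension([Bt], bird_default, [Bt, Ft]))                # True
print(is_extension([Bt], bird_default, [Bt]))                    # False
print(is_extension([Bt, neg(Ft)], bird_default, [Bt, neg(Ft)]))  # True
```

The second call fails because the candidate omits Ft, which the approximating
sequence nevertheless produces; the third succeeds because the added fact ¬Ft
blocks the default.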

The fixed-point nature of extensions is more apparent in Reiter’s official definition,
which relies on an operator Γ_Δ that uses the information from a particular default
theory to map formula sets into formula sets.


Definition 15.2 Where Δ = ⟨W, D⟩ is a default theory and S is some set of
formulas, Γ_Δ(S) is the minimal set satisfying three conditions:


1. W ⊆ Γ_Δ(S)
2. Th(Γ_Δ(S)) = Γ_Δ(S)
3. For each (A : B / C) ∈ D, if A ∈ Γ_Δ(S) and ¬B ∉ S, then C ∈ Γ_Δ(S).


The first two conditions in this definition simply state that Γ_Δ(S) contains the information
provided by the original theory, and that it is closed under logical consequence;
the third condition states that it contains the conclusions of the default rules applicable
in S; and the minimality constraint prevents unwarranted conclusions from
creeping in. Where Δ = ⟨W, D⟩ is a default theory, the operator Γ_Δ maps any formula
set S into the minimal superset of W that is closed under both ordinary logical consequence
and the default rules from D that are applicable in S. The official definition
of extensions – here presented as a theorem – then identifies the extensions of a
default theory as the fixed points of this operator.


Theorem 15.1 The set E is an extension of the default theory Δ iff Γ_Δ(E) = E.


As the reader can verify, the default theories Δ1 and Δ2 have, as desired, the
respective sets Th({Bt, Ft}) and Th({Bt, ¬Ft}) as their extensions. It should be clear
that the notion of an extension defined here is a conservative generalization of the
corresponding notion of a conclusion set from ordinary logic: the extension of a
default theory ⟨W, D⟩, in which D is empty, is simply Th(W). And it can be shown
also that default rules themselves cannot introduce inconsistency: any extension of a
default theory ⟨W, D⟩ will be consistent as long as the ordinary component W of that
theory is consistent.


15.3.3. Default consequence 


In contrast to the situation in ordinary logic, however, not every default theory leads
to a single extension, a single set of appropriate conclusions. Some default theories
have no extensions; Δ4 is an example. The easiest way to see that this theory has no
extensions is to work with Definition 15.1 of the notion, and then to suppose that
Δ4 did have an extension – say, E. Evidently, one would then have either ¬C ∈ E or
¬C ∉ E. Suppose, first, that ¬C ∈ E. Well, since ¬C ∉ E0, and under the supposition
that ¬C ∈ E, it is easy to see from the definition of the approximating sequence
that ¬C ∉ E1, that ¬C ∉ E2, and so on. But since E is simply the union of E0, E1,
E2, and so on, it follows, contrary to assumption, that ¬C ∉ E. Next, suppose
¬C ∉ E. In that case, it is easy to see that ¬C ∈ E2, and since E2 is a subset of E,
that ¬C ∈ E, which again contradicts the assumption.

Default theories without extensions are often viewed as incoherent, and can perhaps
be dismissed simply as anomalous. But there are also perfectly coherent default
theories that allow multiple extensions. A standard example arises when one tries to
encode as a default theory the inheritance network depicted in Figure 15.2, known
as the Nixon Diamond, and representing the following set of facts:


Nixon is a Quaker.
Nixon is a Republican.
Quakers tend to be pacifists.
Republicans tend not to be pacifists.


Figure 15.2. The Nixon Diamond


If one instantiates for Nixon the general statements expressed here about Quakers
and Republicans, the resulting theory is Δ5 = ⟨W5, D5⟩, with

\[
W_5 = \{ Qn, Rn \}
\]

and

\[
D_5 = \{ (Qn : Pn / Pn),\ (Rn : \neg Pn / \neg Pn) \}
\]

This theory allows both Th(W5 ∪ {Pn}) and Th(W5 ∪ {¬Pn}) as extensions. Initially,
before drawing any new conclusions, both of the default rules from D5 are applicable,
but once one adopts the conclusion of either, the applicability of the other is
blocked.
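
Both claims, that the self-undermining theory has no extensions and that the Nixon
Diamond has two, can be checked mechanically for these small propositional cases.
The sketch below is not Reiter's own procedure: it relies on the standard fact (not
stated in this chapter) that every extension is the closure of W together with the
consequents of some subset of the defaults, and simply tests each such candidate.
The truth-table encoding and all function names are assumptions.

```python
from itertools import combinations, product

ATOMS = ["Qn", "Rn", "Pn", "A", "B", "C"]


def entails(premises, formula):
    """Truth-table test: every valuation satisfying the premises satisfies formula."""
    for bits in product([False, True], repeat=len(ATOMS)):
        v = dict(zip(ATOMS, bits))
        if all(p(v) for p in premises) and not formula(v):
            return False
    return True


def atom(name):
    return lambda v: v[name]


def neg(f):
    return lambda v: not f(v)


def implies(f, g):
    return lambda v: (not f(v)) or g(v)


def is_extension(facts, rules, candidate):
    """Guess-and-check test: justifications are checked against the candidate."""
    gens, changed = list(facts), True
    while changed:
        changed = False
        for pre, just, cons in rules:
            if (entails(gens, pre) and not entails(candidate, neg(just))
                    and not entails(gens, cons)):
                gens.append(cons)
                changed = True
    return (all(entails(gens, f) for f in candidate)
            and all(entails(candidate, f) for f in gens))


def extensions(facts, defaults):
    """Names of rule subsets whose consequents, added to the facts, pass the test.

    Different subsets may generate the same extension in general.
    """
    rules, found = list(defaults.values()), []
    for size in range(len(defaults) + 1):
        for subset in combinations(defaults, size):
            candidate = list(facts) + [defaults[n][2] for n in subset]
            if is_extension(facts, rules, candidate):
                found.append(subset)
    return found


Qn, Rn, Pn, A, B, C = (atom(n) for n in ATOMS)

nixon = {"quaker": (Qn, Pn, Pn), "republican": (Rn, neg(Pn), neg(Pn))}
print(extensions([Qn, Rn], nixon))     # [('quaker',), ('republican',)]

undermining = {"r": (A, C, B)}         # the rule (A : C / B)
print(extensions([A, implies(B, neg(C))], undermining))   # []
```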

In cases like this, when a default theory leads to more than one extension, it is difficult
to decide what conclusions a reasoner should actually draw from the information
contained in the theory, and several options have been discussed in the literature.
One option is to suppose that the reasoner should arbitrarily select one of the
theory’s several extensions and endorse the conclusions contained in it; a second
option is to suppose that the reasoner should be willing to endorse a conclusion as
long as it is contained in some extension of the default theory. These first two
options are sometimes said to reflect a credulous reasoning strategy. A third option,
sometimes described as skeptical, is to suppose that the reasoner should endorse a
conclusion only if it is contained in every extension of the default theory.
The first of these options – pick an arbitrary extension – really does seem to reflect
a rational policy for reasoning in the face of conflicting information: often, given
such information, one simply adopts some internally coherent point of view in which
the conflicts are resolved in some particular way, regardless of the fact that there are
other coherent points of view available in which the conflicts are resolved in a
different way. Still, although this reasoning policy is rational, it is hard to see how
such a policy could be codified as a formal consequence relation. If the choice of
extension really is arbitrary, different reasoners could easily select different extensions,
or the same reasoner might select different extensions at different times. Which
extension, then, would represent the consequence set of the theory?

The second option – endorse a conclusion whenever it is contained in some
extension of the default theory – can indeed be codified as a consequence relation,
but it would be a peculiar one. According to this policy, the consequence set of a
default theory need not be closed under standard logical consequence, and, in fact,
might easily be inconsistent. The consequence set of Δ5, for example, would contain
both Pn and ¬Pn, since each of these formulas belongs to some extension of the
default theory, but it would not contain Pn ∧ ¬Pn. This second option seems to
provide a characterization, not so much of the formulas that should be believed on
the basis of a default theory, but instead of the formulas that are believable.

Only the third, skeptical option – endorse a conclusion whenever it is contained in
every extension of the default theory – results in a natural consequence relation, as
follows.


Definition 15.3 Let Δ = ⟨W, D⟩ be a default theory and A a formula. Then A is
a skeptical consequence of Δ – written Δ |~ A – just in case A ∈ E for each extension
E of Δ.


And it is worth noting explicitly, now that a formal consequence relation has been
defined, that it is indeed nonmonotonic in two ways: both adding new factual
information to the W-component of a default theory and adding new default
information to the D-component can force one to abandon consequences previously
supported. The first possibility can be illustrated by referring back to the default
theories Δ1 and Δ2. Here, Δ1 |~ Ft, but it is not the case that Δ2 |~ Ft, even though Δ2
is obtained by adding the new factual information that ¬Ft to the W-component of
Δ1. To illustrate the second case, consider the default theory Δ6 = ⟨W6, D6⟩, where
W6 = W5 and D6 = {(Qn : Pn / Pn)}; this theory is like the Nixon Diamond Δ5, except
without the default that Republicans tend not to be pacifists. It is easy to see that Δ6
has Th(W6 ∪ {Pn}) as its only extension, so that Δ6 |~ Pn. The theory Δ5, however,
has two extensions, one of which does not contain Pn; so it is not the case that
Δ5 |~ Pn, even though Δ5 results from the addition of the new default information
(Rn : ¬Pn / ¬Pn) to the D-component of Δ6.
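
Both kinds of nonmonotonicity can be reproduced with a brute-force sketch of
skeptical consequence for small propositional theories. As before, the encoding is
an assumption: formulas are Python functions on valuations, entailment is decided by
truth tables, and extensions are found by testing the closure of the facts together
with the consequents of each subset of defaults (a standard shortcut, not something
established in this chapter).

```python
from itertools import combinations, product

ATOMS = ["Bt", "Ft", "Qn", "Rn", "Pn"]


def entails(premises, formula):
    """Truth-table test: every valuation satisfying the premises satisfies formula."""
    for bits in product([False, True], repeat=len(ATOMS)):
        v = dict(zip(ATOMS, bits))
        if all(p(v) for p in premises) and not formula(v):
            return False
    return True


def atom(name):
    return lambda v: v[name]


def neg(f):
    return lambda v: not f(v)


def is_extension(facts, defaults, candidate):
    """Guess-and-check test: justifications are checked against the candidate."""
    gens, changed = list(facts), True
    while changed:
        changed = False
        for pre, just, cons in defaults:
            if (entails(gens, pre) and not entails(candidate, neg(just))
                    and not entails(gens, cons)):
                gens.append(cons)
                changed = True
    return (all(entails(gens, f) for f in candidate)
            and all(entails(candidate, f) for f in gens))


def extensions(facts, defaults):
    """Generator lists (facts plus consequents of a rule subset) that pass the test."""
    found = []
    for size in range(len(defaults) + 1):
        for subset in combinations(defaults, size):
            candidate = list(facts) + [cons for _, _, cons in subset]
            if is_extension(facts, defaults, candidate):
                found.append(candidate)
    return found


def skeptical(facts, defaults, formula):
    """The formula belongs to every extension of the theory."""
    return all(entails(e, formula) for e in extensions(facts, defaults))


Bt, Ft, Qn, Rn, Pn = (atom(n) for n in ATOMS)
birds = [(Bt, Ft, Ft)]
quakers = [(Qn, Pn, Pn)]
diamond = quakers + [(Rn, neg(Pn), neg(Pn))]

# New factual information defeats a consequence.
print(skeptical([Bt], birds, Ft))              # True
print(skeptical([Bt, neg(Ft)], birds, Ft))     # False

# New default information defeats a consequence.
print(skeptical([Qn, Rn], quakers, Pn))        # True
print(skeptical([Qn, Rn], diamond, Pn))        # False
```

Note that on this definition a theory with no extensions has every formula as a
skeptical consequence, since the quantification over extensions is then vacuous.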





15.3.4. Examples and non-normal defaults


Now, how can the motivating examples from section 15.2 be handled from the 
perspective of default logic? 
To begin with, the frame problem appears to have a straightforward solution that
results when one supplements the standard logical description of the initial situation
and the available actions with default rules which simply say that facts tend to
persist. To illustrate, one might encode the problem from section 15.2.1 into the
default theory Δ7 = ⟨W7, D7⟩, as follows. First, the factual component W7 contains the
formulas (15.1) through (15.3), describing the initial situation, the axioms characterizing
the effects of the Stack and Unstack actions, and the inductive description of
sequences of actions. Second, the default component D7 contains all instances of the
default rule schema


(HIe, 41: HO, Rea, 1/H1@, Resa, #))) 


which states that: whenever a fact @ holds in a situation 5, if it is consistent to 
conclude that ¢ still holds after the performance of the action a, then one should 
conclude by default that ¢ stil holds after the performance of a. 

It is easy to verify that this default theory has a single extension containing the 
formula (15.4), which is, of course, the intermediate step that was not derivable 
earlier without the help of frame axioms. Although the proposition that block C is 
still clear even after B is unstacked from A does not follow from the factual informa- 
tion contained in (15.1) through (15.3) alone, it can be derived with the help of the 
default rule which says to conclude, unless there is information to the contrary, that
facts tend to persist (see note 4).

Turning to the qualification problem, again a partial solution can be found using
default logic by supplementing the statement of the axioms governing actions with
default rules which say simply that peculiar circumstances that might interfere with these
actions tend not to occur. In the case of the example from section 15.2.2, the relevant
information might be formulated through the theory $\Delta_6 = \langle W_6, D_6 \rangle$, in which $W_6$
contains, in addition to the appropriate background information, the modified Stack axiom
(15.5) as well as the specifications from (15.6) of the various weird circumstances
that might interfere with that action, and in which $D_6$ contains the single default


$$(\top : \neg Weird / \neg Weird)$$


which says to assume, absent information to the contrary, that no such weird
circumstances occur ($\top$ stands for the universally true proposition). Of course, this
representation does not help to resolve the first of the two issues presented by the
qualification problem, namely that the list of conditions that might interfere with the
Stack action is open-ended. The representation does, however, offer a resolution to the
second of these issues. Given a list of various peculiar conditions that might
conceivably interfere with the Stack action, one no longer actually verifies that each of
these conditions fails in order to conclude that Stack has the desired effects; the default
rule allows one simply to assume that these conditions fail unless there is information
to the contrary.

Like the frame and qualification problems, the difficulties presented by closed-
world reasoning also seem to be amenable to a solution based on default logic. As
an initial suggestion, one might represent the information from section 15.2.3, for
example, through the default theory $\Delta_7 = \langle W_7, D_7 \rangle$, with $W_7$ containing the factual
data from (15.7) and $D_7$ containing each instance of the default rule schema


$$(\top : \neg Connects(x, y, z) / \neg Connects(x, y, z))$$


which says that, in the absence of information to the contrary, one should assume
that cities are not connected by a direct flight. This theory will then have a single
extension, allowing one to conclude (under reasonable assumptions, such as that all
existing flights are named) that there is no direct flight between Baltimore and
Barcelona.

Now, step back and notice a common feature in our default logic representation
of these various examples illustrating the frame problem, the qualification problem,
and closed-world reasoning, as well as in our representation of the Nixon Diamond.
Each of these cases relied entirely on default rules of the special form $(A : B / B)$, in
which the same formula occurs as both justification and conclusion. Such default
rules are known as normal defaults, and theories containing only normal defaults as
normal default theories. As shown in Reiter (1980), normal default theories possess
a number of attractive properties that are not shared by default theories in general;
most notably, normal theories are guaranteed to have extensions. Because of these
attractive properties, and because, as has been seen, many important examples can
be coded into normal theories, Reiter originally conjectured that the full expressive
power of default logic might not be needed in realistic applications, and that it could be
limited to normal theories.

This conjecture, however, was soon seen to be incorrect, as is illustrated by
considering the final example, the Tweety Triangle from section 15.2.4. Considering
only normal defaults, the information from the Tweety Triangle is naturally
represented in the theory $\Delta_8 = \langle W_8, D_8 \rangle$, with $W_8$ containing the sentences $Pt$ and
$\forall x(Px \supset Bx)$, stating that Tweety is a penguin and that all penguins are birds, and
with $D_8$ containing the defaults $(Bt : Ft / Ft)$ and $(Pt : \neg Ft / \neg Ft)$, instantiating for
Tweety the generic truths that birds tend to fly and that penguins tend not to. This
default theory, like the representation of the Nixon Diamond as $\Delta_3$, contains two
conflicting default rules, and so leads to two extensions:


$$Th(W_8 \cup \{Ft\}) \quad \text{and} \quad Th(W_8 \cup \{\neg Ft\})$$


But is this right? In the case of the Nixon Diamond, the multiple extensions are
reasonable, since the defaults concerning Quakers and Republicans appear to carry
equal weight. But in the case of the Tweety Triangle, it really does seem that the
default concerning penguins should be preferred to the default concerning birds,
since penguins are a specific kind of bird, and it is always best to reason on the basis
of the most specific information available. One way of capturing such preferences
among defaults, first explored by Etherington and Reiter (1983), is to modify the
representation so that the reasons that might override the application of a default
rule are explicitly built into the statement of that rule. Following this approach, the
default concerning birds from the Tweety Triangle, for example, could be represented,
not by the normal default rule $(Bt : Ft / Ft)$, but instead by the non-normal rule


$$(Bt : [Ft \wedge \neg Pt] / Ft)$$


What this rule says is that, once it is known that Tweety is a bird, if it is consistent
with what is known that Tweety flies and that he is not a penguin, then one should
presume that he flies.

This appeal to non-normal rules solves the initial problem presented by the Tweety
Triangle: when this new, non-normal default is substituted for its normal predecessor
in the previous $\Delta_8$, the resulting theory now has only the single extension
$Th(W_8 \cup \{\neg Ft\})$, which states unambiguously that Tweety does not fly. Only the
default rule $(Pt : \neg Ft / \neg Ft)$ can be applied. The new default $(Bt : [Ft \wedge \neg Pt] / Ft)$
does not come into play, since $Pt$ is known.
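The contrast between the normal and the non-normal encoding of the Tweety Triangle can be checked mechanically. The sketch below makes two simplifying assumptions that are not in the text: formulas are restricted to literals (strings, with "-" for negation), and the consequence $Bt$ of the universal statement that penguins are birds is entered by hand as a fact. A conjunctive justification such as $[Ft \wedge \neg Pt]$ is represented as a tuple of literals. The function names are hypothetical, and the extension finder is written out in full so that the block runs on its own.

```python
from itertools import combinations

def neg(lit):
    """Complement of a literal; '-Ft' stands for 'not Ft'."""
    return lit[1:] if lit.startswith("-") else "-" + lit

def extensions(W, D):
    """Guess-and-check extensions of a literal-only default theory."""
    W = frozenset(W)
    found = set()
    for r in range(len(D) + 1):
        for chosen in combinations(D, r):
            E = W | {c for (_, _, c) in chosen}
            if any(neg(l) in E for l in E):
                continue                      # skip inconsistent candidates
            derived, changed = set(W), True
            while changed:                    # fire applicable defaults
                changed = False
                for (pre, justs, concl) in D:
                    if (pre in derived and concl not in derived
                            and all(neg(j) not in E for j in justs)):
                        derived.add(concl)
                        changed = True
            if derived == E:                  # candidate reproduces itself
                found.add(E)
    return found

W8 = {"Pt", "Bt"}                             # Bt added by hand (assumption)
penguin = ("Pt", ("-Ft",), "-Ft")             # (Pt : -Ft / -Ft)
bird_normal = ("Bt", ("Ft",), "Ft")           # (Bt : Ft / Ft)
bird_non_normal = ("Bt", ("Ft", "-Pt"), "Ft") # (Bt : [Ft and -Pt] / Ft)

normal_exts = extensions(W8, [bird_normal, penguin])
non_normal_exts = extensions(W8, [bird_non_normal, penguin])
print(len(normal_exts), len(non_normal_exts))
```

With the normal rule there are two extensions, one containing `Ft` and one containing `-Ft`; with the non-normal rule only the extension containing `-Ft` survives, because the justification `-Pt` is blocked by the known fact `Pt`.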

Unfortunately, in solving the previous problem, the strategy of using non-normal
rules to express preferences among competing defaults from defeasible inheritance
networks now introduces a new difficulty: the new mapping of information from
inheritance networks into default rules is holistic, in that the translation of a particular
statement can vary depending on the context in which it is embedded. To illustrate,
suppose one were to supplement the Tweety Triangle with the additional information
that another class of birds (say, very young birds) does not fly. Of course, one
would then have to add to the representation the formula $\forall x(Yx \supset Bx)$, which states
that young birds are birds, as well as the default $(Yt : \neg Ft / \neg Ft)$, instantiating for
Tweety the statement that young birds tend not to fly. But in addition, since there
is now another possible reason present for overriding the default that birds tend to
fly, the previous representation of that default must also be replaced with the new rule


$$(Bt : [Ft \wedge \neg Pt \wedge \neg Yt] / Ft)$$


From a computational point of view, this consequence is unattractive because it
makes the process of updating a body of information extremely complicated,
involving not only the representation of new information, but also the reformulation of
information that was already represented. From a philosophical point of view, the
consequence is unattractive for much the same reason that holism is generally
unattractive: the meaning of the statement that birds tend to fly seems not to vary from
context to context, and so it is odd that its translation should vary.


15.4. A Model-Preference Approach: Circumscription


It was noted in the introduction that the monotonicity property reflects both proof
theoretic and model theoretic assumptions of ordinary logic. Default logic results
from a modification of the usual proof theoretic assumptions, introducing rules of
inference that depend on the absence as well as the presence of information. This
section now turns to a theory that results from a modification of the usual semantic
assumptions.

Typically, a formula $A$ is said to be a semantic consequence of a set of formulas
$\Gamma$, written $\Gamma \models A$, when $A$ is true in every model of $\Gamma$. For many applications,
however, one does not really care about all the models of $\Gamma$, but only about certain
preferred models, and it then seems reasonable to modify the usual notion of
consequence so that $A$ is said to be a consequence of $\Gamma$ whenever $A$ is true in all the
preferred models of $\Gamma$. The theory of circumscription, originally formulated by
McCarthy (1980), results from this general preferential framework when the preferred
models are defined as those in which certain predicates have minimal extensions.


15.4.1. Predicate circumscription


Taking a model as a pair $\mathcal{M} = \langle D, v \rangle$, with $D$ a domain and $v$ an interpretation of
some fixed background language over that domain, begin by defining more precisely
the preference ordering on models that forms the semantic background for the
theory of circumscription. The general idea is that one model is at least as preferable
as another just in case, while agreeing on everything else, the first assigns to some
particular predicate $P$ an extension at least as small as that assigned by the second.


Definition 15.4 Where $\mathcal{M}_1 = \langle D_1, v_1 \rangle$ and $\mathcal{M}_2 = \langle D_2, v_2 \rangle$ are models and where $P$
is a predicate, then $\mathcal{M}_1 \le_P \mathcal{M}_2$ just in case


(i) $D_1 = D_2$,
(ii) $v_1(Q) = v_2(Q)$ for every linguistic symbol $Q$ other than $P$, and
(iii) $v_1(P) \subseteq v_2(P)$.


It should be clear that the weak preference relation $\le_P$ is a partial ordering, so that
a corresponding strong preference relation is definable in the standard way.


Definition 15.5 Where $\mathcal{M}_1$ and $\mathcal{M}_2$ are models and where $P$ is a predicate, then
$\mathcal{M}_1 <_P \mathcal{M}_2$ just in case $\mathcal{M}_1 \le_P \mathcal{M}_2$ but $\mathcal{M}_1 \ne \mathcal{M}_2$.


And one can then define the minimal elements in a class of models (the most
preferred elements) as those models from the class for which the class contains no
model that is more preferred.


Definition 15.6 Let $K$ be a set of models and $P$ a predicate. Then $\mathcal{M}$ is
$P$-minimal in $K$ just in case $\mathcal{M} \in K$ and there is no $\mathcal{M}' \in K$ such that $\mathcal{M}' <_P \mathcal{M}$.


Suppose $[\Gamma]$ is the model class of $\Gamma$, the set of models that satisfy each member
of $\Gamma$. Having identified the minimal, or most preferred, models in a class, one can
now define McCarthy's original notion of preferential, or minimal, consequence by
focusing only on the minimal models of a theory, defining a formula as a
consequence of the theory whenever it is true in all those models.


Definition 15.7 Where $\Gamma$ is a set of formulas, $P$ a predicate, and $A$ a formula,
$A$ is said to be a $P$-minimal consequence of $\Gamma$, written $\Gamma \models_P A$, just in case
$\mathcal{M} \models A$ for every model $\mathcal{M}$ that is $P$-minimal in the set $[\Gamma]$.

And it is easy to see that this notion of minimal consequence is nonmonotonic. As
an example, take $\Gamma_1 = \{Pa, a \ne b\}$. Then $\Gamma_1 \models_P \neg Pb$, since the $P$-minimal models
of $\Gamma_1$ are those in which $P$ holds only of the single element $a$, but of course one does
not have $\Gamma_1 \cup \{Pb\} \models_P \neg Pb$.
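The nonmonotonicity of minimal consequence can be seen concretely by enumerating small models. The sketch below is only an approximation of the definition: it assumes a single fixed three-element domain, whereas the definition quantifies over all models, and it encodes a model as a triple (denotation of a, denotation of b, extension of P). All function names are hypothetical.

```python
from itertools import combinations, product

def subsets(xs):
    """All subsets of xs, as frozensets."""
    xs = list(xs)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

def models(domain, theory):
    """Interpretations (a, b, P) over the domain that satisfy every sentence."""
    return [(a, b, P)
            for a, b in product(domain, repeat=2)
            for P in subsets(domain)
            if all(sentence((a, b, P)) for sentence in theory)]

def p_minimal(ms):
    """Models with no rival that agrees on a and b and has a strictly smaller P."""
    return [m for m in ms
            if not any(n[:2] == m[:2] and n[2] < m[2] for n in ms)]

def minimal_consequence(domain, theory, formula):
    """True when the formula holds in every P-minimal model of the theory."""
    return all(formula(m) for m in p_minimal(models(domain, theory)))

# Sentences are represented as Python predicates over a model triple.
Pa = lambda m: m[0] in m[2]
a_neq_b = lambda m: m[0] != m[1]
Pb = lambda m: m[1] in m[2]
not_Pb = lambda m: m[1] not in m[2]

domain = (0, 1, 2)
before = minimal_consequence(domain, [Pa, a_neq_b], not_Pb)
after = minimal_consequence(domain, [Pa, a_neq_b, Pb], not_Pb)
print(before, after)
```

On this domain the conclusion that b lacks P holds in every P-minimal model of the original theory and fails once Pb is added, matching the example in the text.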

In addition to defining the notion of minimal consequence, McCarthy provides a
sound second-order syntactic characterization of the idea through the axiom of
circumscription, for which some preliminary notation is needed. Where $P$ and $Q$ are
$n$-ary predicates, take $P \le Q$ as an abbreviation of the formula


$$\forall x_1 \ldots x_n (P x_1 \ldots x_n \supset Q x_1 \ldots x_n)$$

Likewise, $P < Q$ abbreviates

$$P \le Q \wedge \neg(Q \le P)$$

and $P = Q$ abbreviates

$$P \le Q \wedge Q \le P$$


Where $\Gamma$ is a finite theory, $\hat{\Gamma}$ stands for the conjunction of the members of $\Gamma$, and
$\hat{\Gamma}^{P}_{Q}$ stands for the result of substituting the predicate $P$ for the predicate $Q$
throughout $\hat{\Gamma}$.

Using this notation, the circumscription formula for the predicate $P$ in the theory
$\Gamma$, abbreviated $Circ[\Gamma; P]$, can be expressed quite simply through the second-
order sentence


$$\hat{\Gamma} \wedge \neg\exists P'[\hat{\Gamma}^{P'}_{P} \wedge P' < P]$$


Any model $\mathcal{M}$ that satisfies the first conjunct of this formula, of course, is a model of
$\Gamma$. But what does the second conjunct say? Well, if there were another model $\mathcal{M}'$ also
satisfying $\Gamma$ and such that $\mathcal{M}' <_P \mathcal{M}$, one could then use the value assigned by $\mathcal{M}'$ to
the predicate $P$ to show that $\mathcal{M}$ satisfies the formula $\exists P'[\hat{\Gamma}^{P'}_{P} \wedge P' < P]$. The force
of the second conjunct, then, is simply that there is no such model $\mathcal{M}'$, and so
together, what the two conjuncts say is that $Circ[\Gamma; P]$ is satisfied by exactly the
$P$-minimal models of $\Gamma$.


Theorem 15.2 Let $\Gamma$ be a finite set of sentences, $P$ a predicate, and $\mathcal{M}$ a model.
Then $\mathcal{M} \models Circ[\Gamma; P]$ just in case $\mathcal{M}$ is $P$-minimal in $[\Gamma]$.


From this result, the soundness of circumscription with respect to minimal
consequence follows at once.


Theorem 15.3 Let $\Gamma$ be a finite set of sentences, $P$ a predicate, and $A$ a
formula. Then $\Gamma \models_P A$ whenever $Circ[\Gamma; P] \vdash A$.


The argument is again straightforward. To say that $\Gamma \models_P A$ is to say that every
$P$-minimal model of $\Gamma$ satisfies $A$, so let $\mathcal{M}$ be such a model. From the preceding result,
it is known that $\mathcal{M} \models Circ[\Gamma; P]$. Since $Circ[\Gamma; P] \vdash A$, the soundness of second-
order logic says that $Circ[\Gamma; P] \models A$, and so one can conclude that $\mathcal{M} \models A$.

Of course, circumscription is not complete with respect to minimal consequence;
not every minimal consequence of a theory can be derived from the circumscription
formula. But this failure is no surprise, following from the incompleteness of second-
order logic itself. It was also noticed early on that the result of circumscribing certain
predicates even in consistent theories might lead to inconsistency; a simple example,
due to Etherington et al. (1985), results when one considers the theory $\Gamma_2$, containing
the sentences


$$\exists x[Nx \wedge \forall y(Ny \supset x \ne s(y))]$$
$$\forall x(Nx \supset Ns(x))$$
$$\forall x y(s(x) = s(y) \supset x = y)$$


Any model $\mathcal{M}$ of $\Gamma_2$ must assign to $N$ an extension containing a series isomorphic
to the natural numbers (with $s$ interpreted as successor); and one can then define
another model $\mathcal{M}'$ of $\Gamma_2$ simply by deleting from the extension of $N$ the initial
element of this series. Evidently, then, $\mathcal{M}' <_N \mathcal{M}$, and so the model class of $\Gamma_2$ has no
$N$-minimal elements. Since, as has been seen, $Circ[\Gamma_2; N]$ is satisfied by all and only
the $N$-minimal elements of this model class, it follows that the result of circumscribing
the predicate $N$ in the theory $\Gamma_2$ is not satisfiable.

To illustrate the use of the circumscription formula, consider how circumscribing
the predicate $P$ in the earlier example of $\Gamma_1$ allows one to derive $\neg Pb$. To begin with,
it is most convenient to express the circumscription formula $Circ[\Gamma_1; P]$, not exactly
in the fashion displayed above, but instead in the logically equivalent form


$$\hat{\Gamma}_1 \wedge \forall P'[((\hat{\Gamma}_1)^{P'}_{P} \wedge P' \le P) \supset P' = P]$$


The second conjunct of this formula can then be instantiated by identifying $P'$ with
the predicate $\lambda x(x = a)$, in which case it is easy to see from the ordinary logic of
identity that both the formulas $(\hat{\Gamma}_1)^{P'}_{P}$ and $P' \le P$ are derivable from $\Gamma_1$. The second
conjunct therefore allows us to derive the formula $P' = P$, that is, $\forall x(\lambda x(x = a)x \equiv Px)$,
and from this one can conclude at once that $\neg Pb$, since $\Gamma_1$ contains the information
that $a \ne b$.
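The equivalence between the two forms of the circumscription formula can be checked by unfolding the abbreviation $P' < P$. The steps below are a reconstruction offered as a reading aid, writing $\hat{\Gamma}$ for the conjunction of the members of the theory and $\hat{\Gamma}^{P'}_{P}$ for the result of substituting $P'$ for $P$ in it; the last step uses the fact that, given $P' \le P$, the conditions $P \le P'$ and $P' = P$ coincide.

```latex
\begin{align*}
\neg\exists P'[\hat{\Gamma}^{P'}_{P} \wedge P' < P]
  &\equiv \forall P'[\hat{\Gamma}^{P'}_{P} \supset \neg(P' \le P \wedge \neg(P \le P'))] \\
  &\equiv \forall P'[(\hat{\Gamma}^{P'}_{P} \wedge P' \le P) \supset P \le P'] \\
  &\equiv \forall P'[(\hat{\Gamma}^{P'}_{P} \wedge P' \le P) \supset P' = P]
\end{align*}
```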


15.4.2. Variable circumscription 


The inference relation defined by the theory of predicate circumscription allows
one, for example, to formalize the kind of closed-world reasoning illustrated in
section 15.2.3: by circumscribing the extension of the predicate Connects, one could
then conclude that there is no direct flight connecting Washington and Barcelona. It
turns out, however, that this theory is of severely limited applicability for the simple
reason that it never allows new positive conclusions to be drawn by default.

This failure can be illustrated by returning again to the initial example. Given the
information that Tweety is a bird and that birds fly, how could one use the theory
of circumscription to reach the conclusion that Tweety flies? It was suggested by
McCarthy that defaults might naturally be represented in the theory through an
appeal to explicit abnormality predicates. Where the predicate $AB$ stands for
abnormality with respect to flying, for example, the statement that birds fly might be
represented through the formula $\forall x((Bx \wedge \neg ABx) \supset Fx)$, saying that all birds that
are not abnormal in this respect fly. Suppose $\Gamma_3$ contains this statement as well as $Bt$;
then it might seem that one should be able to reach the conclusion $Ft$ simply by
circumscribing the predicate $AB$, ensuring that there are no more abnormal birds
than necessary.

In fact, this is a reasonable idea, but it fails for technical reasons, as can be seen by
considering the model $\mathcal{M} = \langle D, v \rangle$, with $D = \{t\}$, $v(B) = \{t\}$, $v(AB) = \{t\}$, and
$v(F) = \emptyset$. Of course, $\mathcal{M}$ does not support the statement $Ft$, but it turns out that it is
an $AB$-minimal model of $\Gamma_3$. The only way of decreasing the extension of the predicate
$AB$, while still modeling $\Gamma_3$, would result in increasing the extension of the predicate
$F$; but this violates clause (ii) of definition 15.4, which tells us that models involved
in a preference ordering with respect to a particular predicate must agree in their
treatment of all other predicates.

Because of this problem, McCarthy (1986) elaborated the basic theory of predicate
circumscription into a more flexible theory of variable circumscription, which
orders models with respect to a pair of predicates, $P$ and $Z$. The idea is that those
models are preferred that minimize the extension of $P$ while agreeing on everything
else, with the possible exception of the predicate $Z$, whose extension is allowed to
vary.





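Definition 15.8 Where $\mathcal{M}_1 = \langle D_1, v_1 \rangle$ and $\mathcal{M}_2 = \langle D_2, v_2 \rangle$ are models and where $P$
and $Z$ are distinct predicates, then $\mathcal{M}_1 \le_{P;Z} \mathcal{M}_2$ just in case


(i) $D_1 = D_2$,
(ii) $v_1(Q) = v_2(Q)$ for every linguistic symbol $Q$ other than $P$ and $Z$, and
(iii) $v_1(P) \subseteq v_2(P)$.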


This weak preference ordering is reflexive and transitive, but it is not anti-symmetric,
since it is possible for distinct models, agreeing in their interpretation of every
predicate but $Z$, to bear the $\le_{P;Z}$ relation to one another. Still, one can define a
corresponding strong preference ordering between models by requiring the weak
ordering to hold in only one direction.


Definition 15.9 Where $\mathcal{M}_1$ and $\mathcal{M}_2$ are models and where $P$ and $Z$ are distinct
predicates, then $\mathcal{M}_1 <_{P;Z} \mathcal{M}_2$ just in case $\mathcal{M}_1 \le_{P;Z} \mathcal{M}_2$ and it is not the case that
$\mathcal{M}_2 \le_{P;Z} \mathcal{M}_1$.


And then the pattern set out above can be followed in defining the $P;Z$-minimal
models in a class, and the corresponding notion of consequence.










John F. Horty


Definition 15.10 Let $K$ be a set of models and $P$ and $Z$ distinct predicates.
Then $\mathcal{M}$ is $P;Z$-minimal in $K$ just in case $\mathcal{M} \in K$ and there is no $\mathcal{M}' \in K$ such that
$\mathcal{M}' <_{P;Z} \mathcal{M}$.


Definition 15.11 Where $\Gamma$ is a set of formulas and $A$ a formula and $P$ and $Z$ are
distinct predicates, $A$ is a $P;Z$-minimal consequence of $\Gamma$, written $\Gamma \models_{P;Z} A$, just
in case $\mathcal{M} \models A$ for every $\mathcal{M}$ that is $P;Z$-minimal in the set $[\Gamma]$.


These ideas can be illustrated by returning once again to the initial example. As
already seen, the formula $Ft$ is not an $AB$-minimal consequence of $\Gamma_3$, since the
model $\mathcal{M}$ defined above is $AB$-minimal in the model class of $\Gamma_3$, but does not support
this statement. One can now, however, define the model $\mathcal{M}' = \langle D, v' \rangle$ like $\mathcal{M}$ except
that $v'(AB) = \emptyset$ and $v'(F) = \{t\}$. It is then easy to see that $\mathcal{M}' <_{AB;F} \mathcal{M}$, so that $\mathcal{M}$ is
not $AB;F$-minimal, that $\mathcal{M}'$ is itself $AB;F$-minimal, and that every $AB;F$-minimal
model of $\Gamma_3$ supports the statement $Ft$, so that now $\Gamma_3 \models_{AB;F} Ft$.
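The difference between holding $F$ fixed and letting it vary can be checked by brute force over a small domain. The following sketch rests on assumptions of its own: a two-element domain (the extra individual `u` is hypothetical), a model encoded as a triple of extensions (B, AB, F), and the constant t interpreted as the element `"t"`. It compares minimization of AB with F fixed against minimization with F allowed to vary.

```python
from itertools import combinations, product

def subsets(xs):
    """All subsets of xs, as frozensets."""
    xs = list(xs)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

DOMAIN = ("t", "u")   # 'u' is a hypothetical second individual

def theory_models():
    """Models (B, AB, F) of: every non-abnormal bird flies, and t is a bird."""
    return [(B, AB, F)
            for B, AB, F in product(subsets(DOMAIN), repeat=3)
            if "t" in B and all(x in AB or x in F for x in B)]

def minimal(ms, fixed):
    """Models with no rival that agrees on the `fixed` positions and has a
    strictly smaller AB extension (position 1)."""
    return [m for m in ms
            if not any(all(n[i] == m[i] for i in fixed) and n[1] < m[1]
                       for n in ms)]

ms = theory_models()
M = (frozenset({"t"}), frozenset({"t"}), frozenset())   # the model from the text

ab_minimal = minimal(ms, fixed=(0, 2))    # predicate circumscription: B, F fixed
abf_minimal = minimal(ms, fixed=(0,))     # variable circumscription: F may vary

flies = lambda m: "t" in m[2]
print(M in ab_minimal, M in abf_minimal, all(flies(m) for m in abf_minimal))
```

Under these assumptions the model in which Tweety is abnormal and does not fly counts as minimal when F is fixed, but is displaced once F may vary, and every remaining minimal model has Tweety flying.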

As before, a sound second-order syntactic characterization of the notion of
$P;Z$-minimal consequence can be provided through the following circumscription
formula, abbreviated $Circ[\Gamma; P; Z]$ and expressing the result of circumscribing the
predicate $P$ in the theory $\Gamma$ while allowing $Z$ to vary (here $\hat{\Gamma}^{P',Z'}_{P,Z}$ results from
substituting $P'$ for $P$ and $Z'$ for $Z$ throughout the conjunction $\hat{\Gamma}$ of the members of $\Gamma$):


$$\hat{\Gamma} \wedge \neg\exists P'Z'[\hat{\Gamma}^{P',Z'}_{P,Z} \wedge P' < P]$$


And again, the variable circumscription formula $Circ[\Gamma; P; Z]$ can be seen to hold in
exactly the $P;Z$-minimal models of the theory $\Gamma$, from which it follows immediately
that variable circumscription is sound with respect to $P;Z$-minimal consequence.


Theorem 15.4 Let $\Gamma$ be a finite set of sentences, $P$ and $Z$ distinct predicates,
and $\mathcal{M}$ a model. Then $\mathcal{M} \models Circ[\Gamma; P; Z]$ just in case $\mathcal{M}$ is $P;Z$-minimal
in $[\Gamma]$.


Theorem 15.5 Let $\Gamma$ be a finite set of sentences, $P$ and $Z$ distinct predicates,
and $A$ a formula. Then $\Gamma \models_{P;Z} A$ whenever $Circ[\Gamma; P; Z] \vdash A$.


The application of this new variable circumscription formula can be illustrated
through the initial example, deriving $Ft$ from $\Gamma_3$ by circumscribing $AB$ while allowing
$F$ to vary. As before, begin by rewriting $Circ[\Gamma_3; AB; F]$ as


$$\hat{\Gamma}_3 \wedge \forall P'Z'[((\hat{\Gamma}_3)^{P',Z'}_{AB,F} \wedge P' \le AB) \supset P' = AB]$$


Then, the second conjunct of this formula can be instantiated by identifying $P'$ with
the empty predicate $\lambda x(x \ne x)$ and identifying $Z'$ with $\lambda x(x = x)$. It is a
straightforward matter, using the information from $\Gamma_3$, to verify both $(\hat{\Gamma}_3)^{P',Z'}_{AB,F}$ and
$P' \le AB$, and so one can conclude that $P' = AB$, i.e., that $\forall x(\lambda x(x \ne x)x \equiv ABx)$. From
this it follows at once, of course, that $\neg ABt$, which allows one to conclude, again using
the information from $\Gamma_3$, that $Ft$.


15.4.3. Parallel and prioritized circumscription


The theory of circumscription set out here has been generalized in a number of
ways. Two are sketched: parallel circumscription, which allows several predicates to
be circumscribed at once, while several others vary; and prioritized circumscription,
which allows some predicates to be circumscribed with higher priority than others.

In fact, the theory of parallel circumscription is best seen simply as a notational
elaboration of the previous theory. Suppose that, while allowing $X \subseteq Y$ to carry its
usual meaning when $X$ and $Y$ are sets, this notation is generalized so that, when
$X = X_1, \ldots, X_n$ and $Y = Y_1, \ldots, Y_n$ are $n$-tuples of sets, $X \subseteq Y$ means that $X_i \subseteq Y_i$
for each $i$ between 1 and $n$. Suppose also that, where $P = P_1, \ldots, P_n$ is a tuple of
predicates, $v(P)$ represents the tuple $v(P_1), \ldots, v(P_n)$ of extensions assigned to these
predicates by the interpretation $v$. And finally, where $P = P_1, \ldots, P_n$
and $Q = Q_1, \ldots, Q_n$ are $n$-tuples of predicates, with each $P_i$ taking the same number
of arguments as the corresponding $Q_i$, let $P \le Q$ mean $P_1 \le Q_1 \wedge \cdots \wedge P_n \le Q_n$,
and take $P < Q$ and $P = Q$ to be defined as before.

Once these notational enhancements are in place, the theory of parallel
circumscription can be presented just as before, in definition 15.8 through theorem 15.5,
with the sole exception that now $P$ and $Z$ must be disjoint tuples of predicates
instead of distinct individual predicates: rather than looking at models in which the
individual predicate $P$ is circumscribed, look at models in which the various
predicates belonging to the tuple $P$ are circumscribed in parallel.

To illustrate this theory, return to the Nixon Diamond from figure 15.2, here
represented through the theory $\Gamma_4$ containing the statements $Qn$ and $Rn$, saying that
Nixon is a Quaker and a Republican, as well as the statements


$$\forall x((Qx \wedge \neg AB_1 x) \supset Px)$$
and
$$\forall x((Rx \wedge \neg AB_2 x) \supset \neg Px)$$


saying that Quakers that are normal in one respect are pacifists, and that Republicans
normal in another respect are not. To decide whether to conclude that Nixon is
a pacifist, it seems reasonable to minimize both sorts of abnormality in parallel, while
allowing the predicate $P$ to vary, focusing, that is, on the $AB_1, AB_2; P$-minimal
models. The reader can then verify that $\Gamma_4$ has one $AB_1, AB_2; P$-minimal model that
assigns an empty extension to $AB_1$ and supports the conclusion $Pn$, as well as
another that assigns an empty extension to $AB_2$ and supports the conclusion $\neg Pn$.
Since neither $Pn$ nor $\neg Pn$ is supported by all $AB_1, AB_2; P$-minimal models of $\Gamma_4$,
one can conclude that neither formula is an $AB_1, AB_2; P$-minimal consequence of
this theory. And by the soundness of circumscription with respect to minimal
consequence, one can conclude also that neither $Pn$ nor $\neg Pn$ can be derived from the
parallel circumscription formula $Circ[\Gamma_4; AB_1, AB_2; P]$.

In the case of the Nixon Diamond, it does seem reasonable to minimize the
abnormalities associated with Quakers and Republicans in parallel; but in other
cases, when defaults have different degrees of strength, it is more natural to assign a
higher priority to the minimization of some abnormalities than others. An example
is provided by the Tweety Triangle, from figure 15.1, which can be represented
through the theory $\Gamma_5$ containing the statements $Pt$ and $\forall x(Px \supset Bx)$, saying that
Tweety is a penguin and that all penguins are birds, as well as the statements


$$\forall x((Bx \wedge \neg AB_1 x) \supset Fx)$$
and
$$\forall x((Px \wedge \neg AB_2 x) \supset \neg Fx)$$


saying that birds normally fly but that penguins normally do not. Here, if one
minimizes the two abnormalities in parallel, again, as in the Nixon Diamond, there
are some minimal models supporting the formula $Ft$ and others supporting
$\neg Ft$, so that one is unable to draw any conclusions. It seems more natural,
however, to minimize the abnormality associated with penguins with a higher
priority than that associated with birds, so that all minimal models then support the
desired conclusion $\neg Ft$.

To develop the theory of prioritized circumscription leading to this result, first
define the relation


$$(X_1, X_2) \sqsubseteq (Y_1, Y_2)$$
to mean that


(i) $X_1 \subseteq Y_1$, and
(ii) if $X_1 = Y_1$ then $X_2 \subseteq Y_2$.


Although this new relation can actually be taken (using the enhanced notation just
introduced in connection with parallel circumscription) as holding between pairs of
tuples of sets, things can be kept simple by reading it as a relation between pairs of
sets, and using it to define the following preference ordering on models.





Definition 15.12 Where $\mathcal{M}_1 = \langle D_1, v_1 \rangle$ and $\mathcal{M}_2 = \langle D_2, v_2 \rangle$ are models and where
$P$, $Q$ and $Z$ are distinct predicates, then $\mathcal{M}_1 \le_{P>Q;Z} \mathcal{M}_2$ just in case

(i) $D_1 = D_2$,
(ii) $v_1(R) = v_2(R)$ for every linguistic symbol $R$ other than $P$, $Q$, or $Z$, and
(iii) $(v_1(P), v_1(Q)) \sqsubseteq (v_2(P), v_2(Q))$.


The idea behind this weak prioritized ordering is that those models are preferred
that minimize the extensions assigned to both the predicates $P$ and $Q$ while allowing
$Z$ to vary, but that minimizing $P$ is assigned a higher priority than minimizing $Q$.

Once this weak prioritized preference ordering has been defined, the development
of the theory follows the pattern set out earlier. A corresponding strong ordering
can be introduced as in definition 15.9, with $\mathcal{M}_1 <_{P>Q;Z} \mathcal{M}_2$ taken to mean that
$\mathcal{M}_1 \le_{P>Q;Z} \mathcal{M}_2$ and it is not the case that $\mathcal{M}_2 \le_{P>Q;Z} \mathcal{M}_1$. The minimal elements of a
class of models can then be defined as in definition 15.10, with $\mathcal{M}$ taken as
$P>Q;Z$-minimal in the class $K$ whenever $\mathcal{M}$ belongs to $K$ and there is no $\mathcal{M}'$ from $K$ such
that $\mathcal{M}' <_{P>Q;Z} \mathcal{M}$. And the appropriate notion of consequence can be defined as in
definition 15.11, with $A$ taken to be a $P>Q;Z$-minimal consequence of $\Gamma$,
written $\Gamma \models_{P>Q;Z} A$, whenever $\mathcal{M} \models A$ for each $P>Q;Z$-minimal model $\mathcal{M}$ from
$[\Gamma]$. With these definitions in hand, the reader can then verify that
$\Gamma_5 \models_{AB_2>AB_1;F} \neg Ft$, i.e., that the statement $\neg Ft$ follows as a consequence of $\Gamma_5$ when
the predicate $AB_2$ is minimized with a higher priority than $AB_1$, allowing $F$ to vary.
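The contrast between parallel and prioritized minimization in the Tweety Triangle can also be checked by enumeration. The sketch below assumes a one-element domain containing only Tweety, encodes a model as a tuple of extensions (P, B, AB1, AB2, F), and uses hypothetical function names; it is an illustration over finite models rather than a general implementation of the definitions.

```python
from itertools import combinations, product

def subsets(xs):
    """All subsets of xs, as frozensets."""
    xs = list(xs)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

DOMAIN = ("t",)   # assumption: Tweety is the only individual

def theory_models():
    """Models (P, B, AB1, AB2, F) of the Tweety Triangle theory over DOMAIN."""
    out = []
    for P, B, AB1, AB2, F in product(subsets(DOMAIN), repeat=5):
        if ("t" in P and P <= B                                # Pt, penguins are birds
                and all(x in AB1 or x in F for x in B)         # normal birds fly
                and all(x in AB2 or x not in F for x in P)):   # normal penguins do not
            out.append((P, B, AB1, AB2, F))
    return out

def parallel_leq(n, m):
    """Weak ordering for minimizing AB1 and AB2 in parallel, F varying."""
    return n[:2] == m[:2] and n[2] <= m[2] and n[3] <= m[3]

def prioritized_leq(n, m):
    """Weak ordering in which AB2 is minimized with higher priority than AB1."""
    return n[:2] == m[:2] and n[3] <= m[3] and (n[3] != m[3] or n[2] <= m[2])

def minimal(ms, leq):
    """Models with no strictly preferred rival under the weak ordering leq."""
    return [m for m in ms if not any(leq(n, m) and not leq(m, n) for n in ms)]

ms = theory_models()
flies = lambda m: "t" in m[4]

par = minimal(ms, parallel_leq)
pri = minimal(ms, prioritized_leq)
print([flies(m) for m in par], [flies(m) for m in pri])
```

On this domain parallel minimization leaves one minimal model in which Tweety flies and one in which he does not, while giving priority to the penguin abnormality leaves only models in which he does not fly.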

Turning to the proof theory for prioritized circumscription, begin by defining
$(P_1, P_2) \le (Q_1, Q_2)$ as an abbreviation of the statement


$$P_1 \le Q_1 \wedge (P_1 = Q_1 \supset P_2 \le Q_2)$$
and then taking $(P_1, P_2) < (Q_1, Q_2)$ to mean that

$$(P_1, P_2) \le (Q_1, Q_2) \wedge \neg((Q_1, Q_2) \le (P_1, P_2))$$

The circumscription formula for minimizing $P$ with higher priority than $Q$ in the
theory $\Gamma$ while allowing $Z$ to vary, abbreviated as $Circ[\Gamma; P > Q; Z]$, can now be
expressed through the second-order statement


$$\hat{\Gamma} \wedge \neg\exists P'Q'Z'[\hat{\Gamma}^{P',Q',Z'}_{P,Q,Z} \wedge (P', Q') < (P, Q)]$$


Analogues to theorems 15.4 and 15.5 can be established, saying that $Circ[\Gamma; P > Q; Z]$
holds in exactly the $P>Q;Z$-minimal models of $\Gamma$, and therefore, that prioritized
circumscription is sound with respect to the appropriate prioritized notion of
minimal consequence. And the interested reader can verify that $\neg Ft$ is indeed derivable
from the formula $Circ[\Gamma_5; AB_2 > AB_1; F]$.

It should be clear that the theories presented here of parallel and prioritized
circumscription can be combined and generalized, so that groups of predicates can
be minimized in parallel, but all with higher priority than other groups of predicates.
One could, for example, speak of the $P_1, P_2 > P_3 > P_4, P_5; Z_1, Z_2$-minimal models as
those obtained by minimizing the predicates $P_1$ and $P_2$ in parallel with higher
priority than $P_3$, which is itself minimized with higher priority than $P_4$ and $P_5$, all
the while allowing $Z_1$ and $Z_2$ to vary. Note, however, that, just as with default logic,
it is still necessary to specify the preferences among various competing defaults by
hand, in this case by explicitly tailoring the priorities involved in the minimization
ordering, rather than coding these preferences into non-normal default rules.





Suggested further reading 


Many of the original papers on nonmonotonic logic are reprinted in Ginsberg (1987). A
more recent collection is Gabbay et al. (1994), which contains several valuable survey articles
on different approaches. There have been a number of variations on the general themes
introduced in Reiter's default logic; the most readable and comprehensive presentation of
these is Delgrande et al. (1994). Another fixed-point theory of nonmonotonic reasoning,
closely related to default logic, is the modal approach of McDermott and Doyle (1987
[1980]). This modal approach was refined in Moore (1985); relations to default logic are
established in Konolige (1988 [1987]). The best general survey of the theory of
circumscription is Lifschitz (1994). Different model-preference approaches, based on different
preference orderings, can be found in Kautz (1986) and Shoham (1988). A general study of
nonmonotonic consequence relations, with a special emphasis on model preference logics, was
initiated by Makinson (1989) and Kraus et al. (1990).


Notes 


1. Just as ordinary inference rules allow multiple premises, default rules allow multiple
prerequisites and also multiple justifications; we limit our attention to default rules in which
prerequisites and justifications are unique for ease of exposition.

2. The use of the credulous/skeptical terminology to characterize these two broad reasoning
strategies was first introduced in Touretzky et al. (1987), but the distinction is older than
this; it was noted already by Reiter, and was described in McDermott (1982) as the
distinction between brave and cautious reasoning.

3. Reiter provides a proof procedure, sound and complete under certain conditions, for
determining whether a formula is believable in this sense on the basis of a default theory.
A different interpretation of this second credulous option is provided in Horty (1994),
which interprets default logic as a deontic logic allowing for moral conflicts.

4. Unfortunately, although the treatment of the frame problem suggested here does seem
to work for the simple example set out in section 15.2.1, it was shown in Hanks and
McDermott (1987) that this straightforward kind of nonmonotonic approach delivers
anomalous results in situations that are only slightly more complicated. Since then a
number of more sophisticated encodings of actions and their effects in various nonmonotonic
logics have been explored, such as those of Lifschitz (1994) and Morgenstern and Stein
(1988), as well as renewed attempts to resolve the frame problem in ordinary monotonic
logics, such as that of Reiter (1992). The field is now an area of active research; a recent
survey can be found in Shanahan (1997).


Chapter 16


Probability, Logic, and 
Probability Logic 


Alan Hájek


The true logic of the world is in the calculus of probabilities.
James Clerk Maxwell 


16.1. Probability and Logic 


‘Probability logic’ might seem like an oxymoron. Logic traditionally concerns matters
immutable, necessary and certain, while probability concerns the uncertain, the
random, the capricious. Yet our subject has a distinguished pedigree. Ramsey begins
his classic “Truth and Probability” (1980 [1931]) with the words: “In this essay the
Theory of Probability is taken as a branch of logic ....” De Finetti (1980) speaks of
“the logic of the probable.” And more recently, Jeffrey (1992) regards probabilities
as estimates of truth values, and thus probability theory as a natural outgrowth of
two-valued logic, what he calls “probability logic.” However the point is put,
probability theory and logic are clearly intimately related. This chapter explores
some of the multifarious connections between probability and logic, and focuses on
various philosophical issues in the foundations of probability theory.

The survey begins in section 16.2 with the probability calculus, what Adams
(1975, p. 34) calls “pure probability logic.” As will be seen, there is a sense in which
the axiomatization of probability presupposes deductive logic. Moreover, some authors
see probability theory as the proper framework for inductive logic: a formal apparatus
for codifying the degree of support a piece of evidence lends a hypothesis, or
the impact of evidence on rational opinion.

Fixing a meaning of ‘probability’ allows more specific connections to logic to be
drawn. Thus section 16.3 considers various interpretations of probability. According
to the classical interpretation, probability and possibility are intimately related, so
that probability becomes a kind of modality. For objective interpretations such as
the frequency and propensity theories, probability theory can be regarded as providing
the logic of ‘chance’. Under the subjective (Bayesian) interpretation, probability can
be thought of as the logic of partial belief. And for the logical interpretation, the
connection to logic is the most direct, probability theory being a logic of partial
entailment, and thus a true generalization of deductive logic.

Kolmogorov's axiomatization is the orthodoxy, the probabilistic analogue of classical
logic. However, a number of authors offer rival systems, analogues of ‘deviant’
logics, as it were. These are discussed in section 16.4, noting some bridges between
them and various logics. Probabilistic semantics is introduced in section 16.5. The
conclusions of even valid inferences can be uncertain when the premises of the
inferences are themselves uncertain. This prompts Adams' version of ‘probability
logic,’ the study of the propagation of probability in such inferences. This, in turn,
motivates the discussion in section 16.6 of the literature on probabilities of conditionals,
in which probability theory is used to illuminate the logic of conditionals.
[See also chapter 17.]

One cannot hope for a complete treatment of a topic this large in a survey this
short. The reader who is interested in pursuing these themes further is invited to 
consult the Suggested Readings and the References at the end. 


16.2. The Probability Axioms 


Probability theory was inspired by games of chance in seventeenth-century France
and inaugurated by the Fermat-Pascal correspondence. Their work culminated in
the publication of The Port Royal Logic, which offered a ‘logic of uncertain expectation’
in Jeffrey's (1992) phrase. However, the development of the probability calculus
had to wait until well into the twentieth century.

Kolmogorov (1950 [1933]) begins his classic book with what he calls the “elementary
theory of probability”: the part of the theory that applies when there are
only finitely many events in question. Let $\Omega$ be a set (the ‘universal set’). A field (or
algebra) on $\Omega$ is a set of subsets of $\Omega$ that has $\Omega$ as a member, and that is closed
under complementation (with respect to $\Omega$) and finite union. Let $\Omega$ be given, and
let $\mathcal{F}$ be a field on $\Omega$. Kolmogorov's axioms constrain the possible assignments of
numbers, called probabilities, to the members of $\mathcal{F}$. Let $P$ be a function from $\mathcal{F}$ to
$[0, 1]$ obeying:


Axiom 1 (Non-negativity)     $P(A) \geq 0$ for all $A \in \mathcal{F}$.

Axiom 2 (Normalization)      $P(\Omega) = 1$.

Axiom 3 (Finite additivity)  $P(A \cup B) = P(A) + P(B)$ for all $A, B \in \mathcal{F}$ such
                             that $A \cap B = \emptyset$.


Call such a triple $(\Omega, \mathcal{F}, P)$ a probability space.
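
The three axioms can be checked mechanically on a small finite space. The following is only a sketch under stated assumptions: the universal set is taken to be the six faces of a die, the field is its power set, and the measure is uniform. The names `OMEGA`, `FIELD`, `is_field` and `satisfies_axioms` are illustrative choices and do not come from Kolmogorov's text.

```python
from fractions import Fraction
from itertools import chain, combinations

# Assumed example: the universal set is the set of faces of a die.
OMEGA = frozenset(range(1, 7))


def powerset(s):
    """Return every subset of s; the power set is the largest field on s."""
    items = list(s)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))
    return [frozenset(c) for c in subsets]


FIELD = powerset(OMEGA)


def P(event):
    """Uniform measure (an assumption of this sketch): each face weighs 1/6."""
    return Fraction(len(event), len(OMEGA))


def is_field(field, omega):
    """omega is a member; closure under complement and finite union holds."""
    members = set(field)
    return (omega in members
            and all(omega - a in members for a in members)
            and all(a | b in members for a in members for b in members))


def satisfies_axioms(p, field, omega):
    """Check axioms 1-3 by brute force over the (finite) field."""
    non_negative = all(p(a) >= 0 for a in field)             # Axiom 1
    normalized = p(omega) == 1                                # Axiom 2
    additive = all(p(a | b) == p(a) + p(b)                    # Axiom 3
                   for a in field for b in field if not (a & b))
    return non_negative and normalized and additive
```

On this space `satisfies_axioms(P, FIELD, OMEGA)` holds, whereas a non-additive assignment such as the square of the uniform measure passes axioms 1 and 2 but fails axiom 3.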

Here the arguments of the probability function are sets, probability theory being
thus parasitic on set theory. One could, instead, attach probabilities to members of
a collection $S$ of sentences of a language, closed under finite truth-functional combinations,
with the following counterpart axiomatization:

I.   $P(A) \geq 0$ for all $A \in S$.
II.  If $T$ is a tautology (of classical logic), then $P(T) = 1$.
III. $P(A \vee B) = P(A) + P(B)$ for all $A \in S$ and $B \in S$ such that $A$ and $B$ are logically
     incompatible.


Note how these axioms take the notions of ‘tautology’ and ‘logical incompatibility’
as antecedently understood. To this extent, one may regard probability theory as
parasitic on deductive logic.

Kolmogorov then allows $\Omega$ to be infinite. A non-empty collection $\mathcal{F}$ of subsets of
$\Omega$ is called a sigma field (or sigma algebra, or Borel field) on $\Omega$ iff (if and only if) $\mathcal{F}$
is closed under complementation and countable union. Define a probability measure
$P(\cdot)$ on $\mathcal{F}$ as a function from $\mathcal{F}$ to $[0, 1]$ satisfying axioms 1-3, as before, and also:


Axiom 4 (Continuity)  $E_n \downarrow \emptyset \Rightarrow P(E_n) \to 0$ (where, for every $n$, $E_n \in \mathcal{F}$)


(As shrinking sets converge to the null set, their probability approaches 0.) Equivalently,
we can replace the conjunction of axioms 3 and 4 with a single axiom:


Axiom 3′ (Countable additivity)  If $\{A_n\}$ is a countable collection of (pairwise)
disjoint sets, each $\in \mathcal{F}$, then

$$P\Big(\bigcup_n A_n\Big) = \sum_n P(A_n)$$


The conditional probability of $A$ given $B$, $P(A \mid B)$, is standardly given by the ratio of
unconditional probabilities:

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}, \quad \text{provided } P(B) > 0.$$

This is often taken to be the definition of conditional probability, although it should
be emphasized that this is a technical usage of the term that may not align perfectly
with a pretheoretical concept that we might have. For example, it seems that we can
make sense of conditional probabilities that are defined in the absence of the requisite
unconditional probabilities, or when the condition $B$ has probability 0 (Hick, 2001).

Various important theorems can now be proved, among them the law of total
probability:

$$P(A) = P(A \mid B) \cdot P(B) + P(A \mid \neg B) \cdot P(\neg B)$$

Even more importantly, various versions of Bayes' theorem can be proved:

$$P(A \mid B) = \frac{P(B \mid A) \cdot P(A)}{P(B)} = \frac{P(B \mid A) \cdot P(A)}{P(B \mid A) \cdot P(A) + P(B \mid \neg A) \cdot P(\neg A)}$$

These are all of the essentials of the mathematical theory of probability that
will be needed here. Jeffrey (1992) stresses various analogies between this ‘formal
probability logic’ and deductive logic. Here is an important one: theorems such as
these enable one, given certain probabilities, to calculate further probabilities. However,
the probability calculus does not itself determine the probabilities of any
sentences, apart from 1 for tautologies and 0 for contradictions. Such values need to
be provided ‘from the outside,’ with probability theory only providing a framework.
Compare this with deductive logic, which says which sentences are consistent with
others, and which sentences are implied by others, but which does not itself determine
the truth values of any sentences, apart from ‘true’ for tautologies and ‘false’
for contradictions.
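
To illustrate how the calculus turns given probabilities into further ones, here is a small sketch that applies the law of total probability and Bayes' theorem. The three input values are supplied ‘from the outside’ and are purely hypothetical; the function names are illustrative.

```python
from fractions import Fraction


def total_probability(p_b_given_a, p_b_given_not_a, p_a):
    """P(B) = P(B|A) P(A) + P(B|not-A) P(not-A)."""
    return p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)


def bayes(p_b_given_a, p_b_given_not_a, p_a):
    """P(A|B) by Bayes' theorem, with P(B) expanded by total probability."""
    p_b = total_probability(p_b_given_a, p_b_given_not_a, p_a)
    if p_b == 0:
        raise ValueError("P(A|B) is undefined by the ratio formula when P(B) = 0")
    return p_b_given_a * p_a / p_b


# Hypothetical inputs, not taken from the text:
P_A = Fraction(1, 100)            # P(A)
P_B_GIVEN_A = Fraction(9, 10)     # P(B | A)
P_B_GIVEN_NOT_A = Fraction(1, 20) # P(B | not-A)

p_b = total_probability(P_B_GIVEN_A, P_B_GIVEN_NOT_A, P_A)
p_a_given_b = bayes(P_B_GIVEN_A, P_B_GIVEN_NOT_A, P_A)
```

With these inputs, `p_b` comes out as 117/2000 and `p_a_given_b` as 2/13. The calculus fixes these outputs only relative to the inputs; it is silent on where the inputs come from.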

How, then, are probability values determined in the first place? This raises the
issue of what probabilities are, i.e., the so-called interpretation of probability.


16.3. Interpretations of Probability


The mathematics of probability is well understood, but its interpretation is controversial.
Kolmogorov's axioms are remarkably economical, and as such they admit
of many interpretations. This section briefly presents some of the best known ones,
emphasizing various connections to logic along the way.


16.3.1. The classical interpretation


According to the classical interpretation, championed, for example, by Laplace,
when evidence equally favors each of various possibilities, or there is no such evidence
at all, the probability of an event is simply the fraction of the total number of
possibilities in which the event occurs; this is sometimes called the principle
of indifference. Thus, the modalities of possibility and probability are intimately
related. For example, the probability of a fair die landing with an even number
showing up is 1/2. Unfortunately, the prescription can apparently yield contradictory
results when there is no single privileged set of possibilities. And even when there is,
critics have argued that biases cannot be ruled out a priori. Finally, classical probabilities
are only finitely additive, so they do not provide an interpretation of the full
Kolmogorov calculus.
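
The classical prescription is easy to state as a computation, and doing so also shows how it depends on the chosen set of possibilities. The sketch below is illustrative only: the function name and the coarse ‘six’ versus ‘not six’ partition are assumptions introduced here to illustrate the worry about privileged possibilities.

```python
from fractions import Fraction


def classical_probability(event, possibilities):
    """Fraction of the possibilities in which the event occurs."""
    possibilities = list(possibilities)
    favourable = [w for w in possibilities if event(w)]
    return Fraction(len(favourable), len(possibilities))


die_faces = range(1, 7)

# The example from the text: an even number shows up.
p_even = classical_probability(lambda face: face % 2 == 0, die_faces)

# Two carvings of the possibilities for 'the die shows a six'.
p_six_by_faces = classical_probability(lambda face: face == 6, die_faces)
p_six_coarse = classical_probability(lambda o: o == "six", ["six", "not six"])
```

Counting faces gives 1/2 for an even number and 1/6 for a six, while the coarse two-cell partition gives 1/2 for a six. Nothing in the prescription itself says which set of possibilities is the privileged one.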





16.3.2. The logical interpretation


Logical theories of probability are descendants of the classical theory. They generalize
the notion that probability is to be computed in the absence of evidence, or on
the basis of symmetrically balanced evidence, to allow probability to be computed
on the basis of the evidence, whatever it may be. At least in their earlier forms,
logical theories saw the probability of a hypothesis given such evidence as objectively
and uniquely determined, and thus ideally to be agreed on by all rational agents.
If one thinks that there can be, besides deductive implication, a weaker relation of
partial implication, then one may also think of logical probability as an analysis of
‘degree of implication.’ This interpretation more than any other explicitly sees probability
as part of logic, namely inductive logic.

Early proponents of logical probability include Keynes (1921), W. E. Johnson
(1932), and Jeffreys (1939). However, by far the most systematic study of logical
probability was by Carnap. He thought of probability theory as an elaboration of
deductive logic, arrived at by adding extra rules. Specifically, he sought to explicate
‘the degree to which hypothesis $h$ is confirmed by evidence $e$,’ with the ‘correct’
conditional probability $c(h, e)$ its explication. Statements of logical probability such
as ‘$c(h, e) = x$’ were then to be thought of as logical truths.

The formulation of logical probability begins with the construction of a formal
language. Carnap (1950) initially considers a class of very simple languages consisting
of a finite number of logically independent monadic predicates (that name
properties) applied to countably many individual constants (that name individuals)
or variables, and the usual logical connectives. The strongest (consistent) statements
that can be made in a given language describe all of the individuals in as much detail
as the expressive power of the language allows. They are conjunctions of complete
descriptions of each individual, each description itself a conjunction containing exactly
one occurrence (negated or unnegated) of each predicate letter of the language. Call
these strongest statements state descriptions.

An inductive logic for a language is a specification, for each pair of statements
$\langle p, q \rangle$ of the language, of a unique probability value, or degree of confirmation $c(p, q)$.
To achieve this, begin by defining a probability measure $m(\cdot)$ over the state
descriptions. Every sentence $h$ of a given language is equivalent to a disjunction
of (mutually exclusive) state descriptions, and its a priori probability $m(h)$ is thus
determined. $m$ in turn will induce a confirmation function $c(\cdot, \cdot)$ according to the
conditional probability formula:


$$c(h, e) = \frac{m(h \,\&\, e)}{m(e)}$$


There are obviously infinitely many candidates for such an $m$, and hence $c$, even for
very simple languages. However, Carnap favors one particular measure, which he
calls ‘$m^*$’. He argues that the only thing that significantly distinguishes individuals
from one another is some qualitative difference, not just a difference in labeling.
Define a structure description as a maximal set of state descriptions, each of which
can be obtained from another by some permutation of the individual names. $m^*$
assigns numbers to the state descriptions as follows: first, every structure description
is assigned an equal measure; then, each state description belonging to a given
structure description is assigned an equal share of the measure assigned to the
structure description. From this, one can then define

$$c^*(h, e) = \frac{m^*(h \,\&\, e)}{m^*(e)}$$

$m^*$ gives greater weight to homogeneous state descriptions than to heterogeneous
ones, thus ‘rewarding’ uniformity among the individuals in accordance with putatively
reasonable inductive practice. It can be shown that $c^*$ allows inductive learning
from experience. However, even insisting that an acceptable confirmation function
must allow such learning, there are still infinitely many candidates; there is no reason
yet to think that $c^*$ is the right choice. Carnap realizes that there is some arbitrariness
here, but nevertheless regards $c^*$ as the proper function for inductive logic; he
thinks it stands out for being simple and natural.
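
The construction of $m^*$ and $c^*$ can be carried out by hand for a very small language. The sketch below assumes a hypothetical language with a single predicate F and three individuals a, b, c; with one predicate, two state descriptions are permutations of each other exactly when they contain the same number of F-individuals, which is how structure descriptions are identified here. All names in the code are illustrative.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

INDIVIDUALS = ("a", "b", "c")  # assumed toy language: one predicate F

# A state description says, for each individual, whether F holds of it.
STATE_DESCRIPTIONS = list(product([True, False], repeat=len(INDIVIDUALS)))


def structure_of(state):
    """With one predicate, permuting names preserves only the count of Fs."""
    return sum(state)


def build_m_star():
    """Equal measure per structure description, split equally inside each."""
    groups = defaultdict(list)
    for state in STATE_DESCRIPTIONS:
        groups[structure_of(state)].append(state)
    share = Fraction(1, len(groups))
    return {state: share / len(members)
            for members in groups.values() for state in members}


M_STAR = build_m_star()


def m(sentence):
    """Measure of a sentence: total weight of the states in which it holds."""
    return sum(w for state, w in M_STAR.items() if sentence(state))


def c_star(h, e):
    """Degree of confirmation of h by e under m*."""
    return m(lambda s: h(s) and e(s)) / m(e)


def Fa(state):
    return state[0]


def Fb(state):
    return state[1]


def Fc(state):
    return state[2]
```

In this toy language the homogeneous state description ‘Fa & Fb & Fc’ receives measure 1/4, while each of the three state descriptions with exactly two Fs receives 1/12. The a priori value `m(Fc)` is 1/2, `c_star(Fc, Fa)` is 2/3, and conditioning on both Fa and Fb raises it to 3/4, which is the sense in which $c^*$ learns from experience.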

He later generalizes his confirmation function to a continuum of confirmation
functions $c_\lambda$, each of which gives the weighted average (weighted according to a
positive real number $\lambda$) of an a priori value of the probability in question, and that
calculated in the light of evidence. Define a family of predicates to be a set of
predicates such that, for each individual, exactly one member of the set applies.
Carnap goes on to explore first-order languages containing a finite number of
families of predicates. Carnap (1963) considers the special case of a language containing
only one-place predicates. He lays down a host of axioms concerning the
confirmation function $c$, including those induced by the probability calculus itself,
various axioms of symmetry (for example, that $c(h, e)$ remains unchanged under
permutations of individuals, and of predicates of any family), and axioms that
guarantee undogmatic inductive learning, and long-run convergence to relative
frequencies. They imply that, for a family $\{P_n\}$, $n = 1, \ldots, k$, $k > 2$:

$$c_\lambda(\text{individual } s+1 \text{ is } P_j,\ s_j \text{ of the first } s \text{ individuals are } P_j) = \frac{s_j + \lambda/k}{s + \lambda}$$

where $\lambda$ is a positive real number. The higher the value of $\lambda$ is, the less impact
evidence has.
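
The role of $\lambda$ as a weight between the a priori value $1/k$ and the observed relative frequency $s_j/s$ can be seen by evaluating the formula directly. This is a minimal sketch; the function name and the sample figures (a family of three predicates, eight of ten observed individuals falling under $P_j$) are assumptions chosen for illustration.

```python
from fractions import Fraction


def c_lambda(s_j, s, k, lam):
    """Carnap's lambda-continuum value (s_j + lam/k) / (s + lam)."""
    if lam <= 0:
        raise ValueError("lambda must be a positive real number")
    lam = Fraction(lam)
    return (s_j + lam / k) / (s + lam)


# Hypothetical evidence: 8 of the first 10 individuals are P_j, with k = 3.
small_lambda = c_lambda(8, 10, 3, Fraction(1, 1000))  # close to 8/10
moderate_lambda = c_lambda(8, 10, 3, 3)               # 9/13
large_lambda = c_lambda(8, 10, 3, 300)                # 54/155, near 1/3
no_evidence = c_lambda(0, 0, 3, 3)                    # exactly 1/3
```

As $\lambda$ grows the value slides from the observed frequency 8/10 towards the a priori value 1/3, so the same evidence moves the agent less and less.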

The problem remains: what is the correct setting of $\lambda$? And problems remain even
once one has fixed the value of $\lambda$. It turns out that a universal statement in an
infinite universe always receives zero confirmation, no matter what the (finite) evidence.
Many find this counterintuitive, since laws of nature with infinitely many
instances can apparently be confirmed. Hintikka (1965) provides a system of confirmation
that avoids this problem.

Recalling an objection to the classical interpretation, the various axioms of symmetry
are hardly mere truths of logic. More seriously, one cannot impose further
symmetry constraints that are seemingly just as plausible as Carnap's, on pain of
inconsistency (Fine, 1973, p. 202). Moreover, Goodman's ‘grue’ paradox apparently
teaches one that some symmetries are better than others, and that inductive
logic must be sensitive to the meanings of predicates, strongly suggesting that a
purely syntactic approach such as Carnap's is doomed. One could try to specify a
canonical language, free of such monstrosities as ‘grue’, to which such syntactic rules
might apply. But traditional logic finds no need for such a procedure, sharpening
the suspicion that Carnap's program is not merely that of extending the boundaries
of logic. Scott and Krauss (1966) use model theory in their formulation of logical
probability for richer and more realistic languages than Carnap's. Still, finding a
canonical language seems to many to be a pipe dream, at least if one wants to
analyze the ‘logical probability’ of any argument of real interest, either in science
or in everyday life.


16.3.3. Frequency interpretations 


The guiding empiricist idea of frequency interpretations, which originated with
Venn, is that an event's probability is the relative frequency of events of that type
within a suitably chosen reference class. The probability that a given coin lands
‘heads’, for example, might be identified with the relative frequency of ‘heads’
outcomes in the class of all tosses of that coin. But there is an immediate problem:
observed relative frequencies can apparently come apart from true probabilities, as
when a fair coin that is tossed ten times happens to land heads every time. Von
Mises (1957) offers a more sophisticated formulation based on the notion of a
collective, rendered precise by Church: a hypothetical infinite sequence of ‘attributes’
(possible outcomes) of a specified experiment, for which the limiting relative frequency
of any attribute exists, and is the same in any recursively specified subsequence.
The probability of an attribute $A$, relative to a collective $\omega$, is then defined
as the limiting relative frequency of $A$ in $\omega$. Limiting relative frequencies violate
countable additivity, and the domain of definition of limiting relative frequency is
not even a field. Thus it does not genuinely provide an interpretation of Kolmogorov's
probability calculus.
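
The requirement that the limiting relative frequency be the same in every recursively specified subsequence does real work. The sketch below uses a hypothetical, perfectly regular sequence H, T, H, T, ... and finite prefixes as a stand-in for limits, which a program cannot compute; the function names are illustrative.

```python
from fractions import Fraction
from itertools import cycle, islice


def relative_frequency(outcomes, attribute):
    """Relative frequency of an attribute in a finite run of outcomes."""
    outcomes = list(outcomes)
    hits = sum(1 for o in outcomes if o == attribute)
    return Fraction(hits, len(outcomes))


def alternating(n):
    """First n terms of the assumed sequence H, T, H, T, ..."""
    return list(islice(cycle("HT"), n))


def every_other(seq):
    """A recursively specified selection: keep the 1st, 3rd, 5th, ... terms."""
    return seq[0::2]


prefix = alternating(1000)
freq_whole = relative_frequency(prefix, "H")
freq_selected = relative_frequency(every_other(prefix), "H")
```

The whole sequence has relative frequency 1/2 for heads, but the selected subsequence has relative frequency 1, and this stays so for any even prefix length. The alternating sequence therefore does not count as a collective, even though its overall limiting relative frequency exists.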

As well as giving a (hypothetical) limiting relative frequency interpretation of
probabilities of events, Reichenbach (1949) gives an interpretation in terms of truth
frequencies: the probability of truth of a statement of a certain type is the limiting
relative frequency of statements of that type being true in a specified reference class
of statements. Out of this definition, he constructs a probability logic with a continuum
of truth values, corresponding to the various possible probabilities.

A notorious problem for any version of frequentism is the so-called problem of the
single case: sometimes non-trivial probabilities are attributed to results of experiments
that occur only once, and that indeed may do so of necessity. Moreover,
frequentist probabilities are always relativized to a reference class, which needs to be
fixed in a way that does not appeal to probability; but most events belong to many
‘natural’ reference classes, which need not agree on the required relative frequency.
In this sense, there may be no such thing as the probability of a given event: the
infamous reference class problem. The move to hypothetical infinite sequences of
trials creates its own problems: there is apparently no fact of the matter as to what
such a hypothetical sequence would be, nor even what its limiting relative frequency
for a given attribute would be, nor indeed whether that limit is even defined; and
the limiting relative frequency can be changed to any value one wants by suitably
permuting the order of trials. In any case, the empiricist intuition that facts about
probabilities are simply facts about patterns in the actual phenomena has been
jettisoned. Still more sophisticated accounts, frequentist in spirit, uphold this intuition;
see, for instance, Lewis (1994). Such accounts sacrifice another intuition: that
it is built into the very concept of ‘chanciness’ that fixing what actually happens does
not fix the probabilities.


16.3.4. Propensity interpretations


Attempts to locate probabilities ‘in the world’ are also made by variants of the
propensity interpretation, championed by such authors as Popper (1959b), Mellor
(1971) and Giere (1973). Probability is thought of as a physical propensity, or
disposition, or tendency of a given type of physical situation to yield an outcome of
a certain kind, or to yield a long-run relative frequency of such an outcome. This
view is explicitly intended to make sense of single-case probabilities, such as ‘the
probability that this radium atom decays in 1500 years is 1/2.’ According to Popper,
a probability p of an outcome of a certain type is a propensity of a repeatable experiment
to produce outcomes of that type with limiting relative frequency p. With its
heavy reliance on limiting relative frequency, this position risks collapsing into von
Mises-style frequentism. It seems moreover not to be a genuine interpretation of the
probability calculus at all, for the same reasons that limiting relative frequentism is
not. Giere, on the other hand, explicitly allows single-case propensities, with no
mention of frequencies: probability is just a propensity of a repeatable experimental
set-up to produce sequences of outcomes. This, however, creates the opposite problem
to Popper's: how, then, does one achieve the desired connection between
probabilities and frequencies? Indeed, it is not clear why the assignments of such
propensities should obey the probability calculus at all. For reasons such as these,
propensity accounts have been criticized for being unacceptably vague.


16.3.5. The subjectivist interpretation (subjective Bayesianism)


Degrees of belief. Subjectivism is the doctrine that probabilities can be regarded as
degrees of belief, sometimes called credences. It is often called ‘Bayesianism’ thanks
to the important role that Bayes' theorem typically plays in the subjectivist's calculations
of probabilities, although the theorem itself is neutral regarding interpretation.
Unlike the logical interpretation (at least as Carnap originally conceived it), subjectivism
allows that different agents with the very same evidence can rationally give
different probabilities to the same hypothesis.

But what is a degree of belief? A standard analysis invokes betting behavior: an
agent's degree of belief in A is p iff the agent is prepared to pay up to p units for a
bet that pays 1 unit if A, 0 if not A (de Finetti, 1980). It is assumed that the agent
is also prepared to sell that bet for p units. Thus, here is an operational definition of
subjective probability, and indeed it inherits some of the difficulties of operationalism
in general, and of behaviorism in particular. For example, the agent may have reason
to misrepresent her true opinion. Moreover, as Ramsey (1980 [1931]) points out,
the proposal of the bet may itself alter her state of opinion; and the agent might
have an eagerness or reluctance to bet. These problems are avoided by identifying
the agent's degree of belief in a proposition with the betting price she regards as fair,
whether or not the agent enters into such a bet (Howson and Urbach, 1993). Still,
the fair price of a bet on A appears to measure not the agent's probability that A will
be the case, but rather the agent's probability that A will be the case and that the
prize will be paid, which may be rather less, for example, if A is unverifiable. Some
think that this commits proponents of the betting interpretation to an underlying
intuitionistic logic.

If no restriction is placed on who the agent is, one would not have an interpretation
of the probability calculus at all, for there would be no guarantee that the
agent's degrees of belief would conform to it. Human agents sometimes violate the
probability calculus in alarming ways (Kahneman et al., 1982), and indeed conforming
all our degrees of belief to the probability calculus is surely an impossible
standard. However, if attention is restricted to ideally rational agents, the claim that
degrees of belief are at least finitely additive probabilities becomes more plausible
(much as deductive consistency of one's beliefs might be an impossible standard but
a reasonable ideal).

So-called Dutch Book arguments provide one important line of justification of this
claim. A Dutch Book is a series of bets, each of which the agent regards as fair, but
which collectively guarantee the agent's loss. De Finetti (1980) proves that if one's
degrees of belief are not finitely additive probabilities, then one is susceptible to a
Dutch Book. Equally important, and often neglected, is Kemeny's (1955) converse
theorem: If an agent's degrees of belief are finitely additive probabilities, then no
Dutch Book can be made against the agent.
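
A Dutch Book against an incoherent agent can be made concrete in a few lines. The sketch assumes a hypothetical agent whose degrees of belief in A and in not-A are both 3/5, so that they sum to more than 1, and who therefore regards 3/5 as a fair price for a 1-unit bet on each. The names and figures are illustrative.

```python
from fractions import Fraction


def net_payoff(bets, a_is_true):
    """Agent's total gain from buying each bet at its price.

    Each bet is a pair (wins_if_a, price): it pays 1 unit if the truth
    value of A matches wins_if_a, and 0 otherwise.
    """
    total = Fraction(0)
    for wins_if_a, price in bets:
        payout = 1 if a_is_true == wins_if_a else 0
        total += payout - price
    return total


# Assumed incoherent credences: P(A) = P(not-A) = 3/5.
incoherent_book = [(True, Fraction(3, 5)), (False, Fraction(3, 5))]
incoherent_outcomes = {truth: net_payoff(incoherent_book, truth)
                       for truth in (True, False)}

# Coherent credences for comparison: P(A) = 3/5, P(not-A) = 2/5.
coherent_book = [(True, Fraction(3, 5)), (False, Fraction(2, 5))]
coherent_outcomes = {truth: net_payoff(coherent_book, truth)
                     for truth in (True, False)}
```

The incoherent agent loses 1/5 of a unit whether A is true or false, whereas the same pair of bets priced by additive credences nets exactly 0 either way. This illustrates one direction of the argument for a single pair of bets; it is not a proof of de Finetti's or Kemeny's theorems.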

A related defence of the probability axioms comes from utility theory. Ramsey
(1980 [1931]) derives both probabilities and utilities (desirabilities) from rational
preferences. Specifically, given various assumptions about the richness of the preference
space, and certain ‘consistency’ assumptions, he shows how to define a real-valued
utility function of the outcomes; in fact, various such functions will represent
the agent's preferences. It turns out that ratios of utility-differences are invariant, the
same whichever representative utility function is chosen. This fact allows Ramsey to
define degrees of belief as ratios of such differences, and to show that they are
finitely additive probabilities. However, it is dubious that consistency requires one to
have a set of preferences as rich as Ramsey requires. This places strain on Ramsey's
claim to assimilate probability theory to logic. However, Howson (1997) argues
that a betting interpretation of probability underpins a soundness and completeness
proof of the probability axioms, thus supporting the claim that the Bayesian theory
does provide a logic of consistent belief.

Savage (1954) likewise derives probabilities and utilities from preferences among
options that are constrained by certain putative ‘consistency’ principles. Jeffrey (1983)
refines the method further, giving a ‘logic of decision’ according to which rational
choice maximizes expected utility, a certain probability-weighted average of utilities.


Updating probability. Suppose that an agent's degrees of belief are initially represented
by a probability function $P_{\text{initial}}(\cdot)$, and that the agent becomes certain of a
piece of evidence $E$. What should be the agent's new probability function $P_{\text{new}}$? To
avoid any gratuitous changes in the agent's degrees of belief that were not prompted
by the evidence, $P_{\text{new}}$ should be the minimal revision of $P_{\text{initial}}$ subject to the constraint
that $P_{\text{new}}(E) = 1$. The favored updating rule among Bayesians is conditionalization:
$P_{\text{new}}$ is derived from $P_{\text{initial}}$ by taking probabilities conditional on $E$, according to the
schema:

(Conditionalization)  $P_{\text{new}}(A) = P_{\text{initial}}(A \mid E)$  (provided $P_{\text{initial}}(E) > 0$)


Lewis (1998) gives a ‘diachronic’ Dutch Book argument for conditionalization: if
one's updating is rule-governed, one is subject to a Dutch Book (at the hands of
a bookie who knows the rule employed) if one does not conditionalize. Equally
important is the converse theorem: if one does conditionalize, one cannot be Dutch
Booked (Skyrms, 1987).
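
On a finite set of possible worlds, conditionalization amounts to discarding the worlds excluded by the evidence and renormalizing the rest. The sketch below assumes a hypothetical agent with uniform initial credences over the six faces of a die who learns that the outcome is even; the function names are illustrative.

```python
from fractions import Fraction


def prob(p, event):
    """Probability of an event (a set of worlds) under the function p."""
    return sum((w for world, w in p.items() if world in event), Fraction(0))


def conditionalize(p_initial, evidence):
    """Return P_new with P_new(A) = P_initial(A | E) for every event A."""
    p_e = prob(p_initial, evidence)
    if p_e == 0:
        raise ValueError("undefined when the evidence has initial probability 0")
    return {world: (w / p_e if world in evidence else Fraction(0))
            for world, w in p_initial.items()}


# Assumed initial credences: uniform over the faces of a die.
P_INITIAL = {face: Fraction(1, 6) for face in range(1, 7)}
EVEN = {2, 4, 6}
HIGH = {4, 5, 6}

P_NEW = conditionalize(P_INITIAL, EVEN)
```

After the update the evidence has probability 1, each even face has credence 1/3, and the credence in a high outcome rises from 1/2 to 2/3, which is exactly the initial conditional probability of a high outcome given an even one.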

Now suppose that, as the result of some experience, the agent's degrees of belief
across a countable partition $\{E_1, E_2, \ldots\}$ change to $\{P_{\text{new}}(E_1), P_{\text{new}}(E_2), \ldots\}$, where
none of these values need be 1 or 0. The rule of Jeffrey conditionalization, or
probability kinematics, relates the agent's new probability function to the initial one
according to

$$P_{\text{new}}(A) = \sum_i P_{\text{initial}}(A \mid E_i) \, P_{\text{new}}(E_i)$$

(If the probabilities are only finitely additive, then the partition and sum must be
finite.) Conditionalization can be thought of as the special case of Jeffrey conditionalization
in which $P_{\text{new}}(E_i) = 1$ for some $i$. Jeffrey conditionalization is supported by
a Dutch Book argument due to Armendt (1980); it is also the rule that, subject to
the constraints on the partition, minimizes a measure of ‘distance’ in function space
between the initial and new probability functions, called ‘cross-entropy’ (Diaconis
and Zabell, 1982).
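
Jeffrey conditionalization can be sketched in the same finite setting: within each cell of the partition the initial proportions are kept, and the cell as a whole is rescaled to its new probability. The example assumes uniform initial credences over a die and a hypothetical experience that shifts the credence in ‘even’ to 3/4; all names are illustrative.

```python
from fractions import Fraction


def prob(p, event):
    """Probability of an event (a set of worlds) under the function p."""
    return sum((w for world, w in p.items() if world in event), Fraction(0))


def jeffrey_conditionalize(p_initial, new_cell_probs):
    """P_new(A) = sum over cells E_i of P_initial(A | E_i) * P_new(E_i).

    new_cell_probs is a list of (cell, new probability) pairs whose cells
    are assumed to partition the worlds of p_initial.
    """
    p_new = {world: Fraction(0) for world in p_initial}
    for cell, new_weight in new_cell_probs:
        old_weight = prob(p_initial, cell)
        if old_weight == 0:
            raise ValueError("each cell needs positive initial probability")
        for world in cell:
            p_new[world] += new_weight * p_initial[world] / old_weight
    return p_new


P_INITIAL = {face: Fraction(1, 6) for face in range(1, 7)}
EVEN, ODD = {2, 4, 6}, {1, 3, 5}
HIGH = {4, 5, 6}

# Assumed shift: credence in 'even' moves from 1/2 to 3/4 without certainty.
P_SHIFTED = jeffrey_conditionalize(
    P_INITIAL, [(EVEN, Fraction(3, 4)), (ODD, Fraction(1, 4))])

# Special case: certainty in one cell recovers ordinary conditionalization.
P_CERTAIN = jeffrey_conditionalize(
    P_INITIAL, [(EVEN, Fraction(1)), (ODD, Fraction(0))])
```

Here the credence in a high outcome becomes 2/3 · 3/4 + 1/3 · 1/4 = 7/12, and setting the new probability of ‘even’ to 1 gives each even face credence 1/3, as ordinary conditionalization on ‘even’ would.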
Orthodox Bayesianism can now be characterized by the following maxims:


B1. The rational agent's ‘prior’ (initial) probabilities conform to the probability
    calculus.

B2. The rational agent's probabilities update by the rule of (Jeffrey)
    conditionalization.

B3. There are no further constraints on the rational agent's probabilities.


Some critics reject orthodox Bayesianism's radical permissiveness regarding prior
probabilities. A standard defense (e.g., Howson and Urbach (1993), and Savage
(1954)) appeals to famous ‘convergence-to-truth’ and ‘merger-of-opinion’ results.
Roughly, their content is that with probability 1, in the long run, the effect of
choosing one prior rather than another is washed out. Successive conditionalizations
on the evidence will make a given agent eventually converge to the truth, and thus
initially discrepant agents eventually come to agree with each other (assuming that
the priors do not give probability 0 to the truth, and that the stream of incoming
evidence is sufficiently rich). In an important sense, at least this much inductive logic
is implicit in the probability calculus.
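
The washing out of priors can be watched in a toy case. The sketch below is an illustration rather than a statement of the theorems: it assumes three candidate biases for a coin, two agents with sharply different priors that both give the hypothesis 0.7 positive probability, and a fixed hypothetical evidence stream in which 7 of every 10 tosses land heads. All names and numbers are assumptions.

```python
def update(credences, heads):
    """Conditionalize on one toss; credences maps a candidate bias to its weight."""
    weighted = {bias: p * (bias if heads else 1.0 - bias)
                for bias, p in credences.items()}
    total = sum(weighted.values())
    return {bias: w / total for bias, w in weighted.items()}


def learn(prior, tosses):
    """Apply successive conditionalizations, one per observed toss."""
    credences = dict(prior)
    for heads in tosses:
        credences = update(credences, heads)
    return credences


# Two assumed priors over the hypotheses 'the bias towards heads is 0.3/0.5/0.7'.
PRIOR_ONE = {0.3: 0.80, 0.5: 0.15, 0.7: 0.05}
PRIOR_TWO = {0.3: 0.05, 0.5: 0.15, 0.7: 0.80}

# Hypothetical evidence: 200 tosses, 7 heads in every block of 10.
TOSSES = [i % 10 < 7 for i in range(200)]

posterior_one = learn(PRIOR_ONE, TOSSES)
posterior_two = learn(PRIOR_TWO, TOSSES)
```

The two agents start 0.75 apart in their credence that the bias is 0.7, yet after the 200 tosses both assign it more than 0.999. An agent whose prior gave that hypothesis probability 0 would never recover it, which is the point of the first proviso above.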
But Bayesianism is a theme that admits of many variations. 


Further constraints on subjective probabilities. Against (B3), some less permissive
Bayesians also require that a rational agent's probabilities be regular, or strictly
coherent: if $P(A) = 1$, then $A$ is a tautology. Regularity is the converse of axiom II
(section 16.2), again linking probability and logic. It is meant to guard against the sort
of dogmatism that no course of learning by (Jeffrey) conditionalization could cure.

Van Fraassen (1995) suggests a further constraint on rational opinion called reflection,
involving credences about one's own future credences. Here is one formulation:


(Reflection)  $P_t(A \mid P_{t+\Delta}(A) = x) = x$  ($\Delta > 0$)


where $P_t$ is the agent's probability function at time $t$. The idea is that when all is
well, a certain sort of epistemic integrity requires one to regard one's future opinions
as being trustworthy, having arisen as the result of a rational process of learning. A
more general version of reflection is presented by Goldstein (1983).

Lewis (1986c [1980]) offers a principle that links objective chance and rational
credence:


(Principal Principle)  $P(A \mid ch_t(A) = x \;\&\; E) = x$


where $ch_t(A)$ is the objective chance of $A$ at time $t$, and $E$ is any information that is
‘admissible’ at time $t$ (roughly, gives no evidence about the actual truth value of $A$).
For example, the principle says: given that this coin is believed to be fair, and thus
has a chance of 1/2 at the moment of landing heads at the next toss, one should assign
credence 1/2 to that outcome. The principle can, on the other hand, be thought of as
giving an implicit characterization of chance as a theoretical property whose distinctive
role is to constrain rational credences in just this way. See Lewis (1994),
Hall (1994b), and Thau (1994) for refinements of the principle.

Finally, there have been various proposals for resuscitating symmetry constraints 
on priors, in the spirit of the classical and logical interpretations. More sophisticated 
versions of the principle of indifference have been explored by Jaynes (1968) and 
Paris and Vencovská (1997). Their guiding idea is to maximize the probability 
function’s entropy, which for an assignment of positive probabilities p_1, ..., p_n to n 
worlds equals -Σ_i p_i log(p_i). 
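
The entropy formula can be illustrated with a small numerical sketch. The three priors below are hypothetical, chosen only to show that, absent further constraints, the uniform assignment (the one the principle of indifference recommends) attains the largest entropy, log(n).

```python
import math

def entropy(probs):
    """Entropy -sum_i p_i log(p_i) of an assignment of positive probabilities."""
    return -sum(p * math.log(p) for p in probs)

# Hypothetical priors over n = 4 worlds (illustrative numbers, not from the text).
uniform = [0.25, 0.25, 0.25, 0.25]
skewed = [0.4, 0.3, 0.2, 0.1]
peaked = [0.97, 0.01, 0.01, 0.01]

for name, dist in [("uniform", uniform), ("skewed", skewed), ("peaked", peaked)]:
    print(f"{name:8s} entropy = {entropy(dist):.4f}")

# The uniform assignment reaches the maximum possible value, log(n).
max_entropy = math.log(len(uniform))
```

When further constraints are imposed (say, a fixed expectation), the maximizing function is generally no longer uniform; that case is not sketched here.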

Orthodox Bayesianism has also been denounced for being overly demanding: its 
requirements of sharp probability assignments to all propositions, logical omniscience, 
and so on have been regarded by some as unreasonable idealizations. This motivates 
various relaxations of tenets (B1) and (B2) above. (B2) might be weakened to allow 
other rules for the updating of probabilities besides conditionalization: for example, 
revision to the probability function that maximizes entropy, subject to the relevant 
constraints (Jaynes, 1968; Skyrms, 1987a). And some Bayesians drop the requirement 
that probability updating be rule-governed altogether, e.g., Earman (1992). 

The relaxation of (B1) is a large topic, and it motivates some non-Kolmogorovian 
theories of probability. 


16.4. Non-Kolmogorovian Theories of Probability 


A number of authors would abandon the search for an adequate interpretation of 
Kolmogorov’s probability calculus, since they abandon some part of his axiomatization. 


Abandoning the sigma field sub-structure Fine (1973) argues that requiring the 
domain of the probability function to be a sigma field is overly restrictive. For 
example, one might have limited census data on race and gender that gives good 
information concerning the probability P(M) that a randomly chosen person is male, 
and the probability P(B) that such a person is black, without giving any information 
about the probability P(M ∩ B) that such a person is both male and black. 


Abandoning sharp probabilities Each Kolmogorovian probability is a single number. 
But suppose that an agent’s state of opinion does not determine a single probability 
function, but rather is consistent with a multiplicity of such functions. In that case, 
one might represent the agent’s opinion as the set of all these functions; see, for 
example, Jeffrey (1992) and Levi (1980). Each function in this set corresponds to a 
way of precisifying an agent’s opinion in a legitimate way. This approach will 
typically coincide with interval-valued probability assignments, but it need not. Koopman 
(1980 [1940]) offers axioms for ‘upper’ and ‘lower’ probabilities which may be 
thought of as the endpoints of such intervals. See also Walley (1991) for an 
extensive treatment of imprecise probabilities. 
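
As a rough sketch of the set-of-functions representation, the snippet below takes three hypothetical probability functions over three mutually exclusive outcomes and reads off lower and upper probabilities as the minimum and maximum over the set. The outcome names, the numbers, and the helper names `lower` and `upper` are assumptions made for illustration; this is not Koopman’s axiomatization.

```python
from fractions import Fraction as F

# Each dict is one way of precisifying the agent's opinion (hypothetical values).
credal_set = [
    {"rain": F(1, 2), "snow": F(1, 4), "dry": F(1, 4)},
    {"rain": F(3, 5), "snow": F(1, 5), "dry": F(1, 5)},
    {"rain": F(2, 5), "snow": F(1, 10), "dry": F(1, 2)},
]

def prob(p, event):
    """Probability of an event (a set of outcomes) under one function p."""
    return sum((p[w] for w in event), F(0))

def lower(event):
    """Lower probability: the least value any function in the set assigns."""
    return min(prob(p, event) for p in credal_set)

def upper(event):
    """Upper probability: the greatest value any function in the set assigns."""
    return max(prob(p, event) for p in credal_set)

wet = {"rain", "snow"}
dry = {"dry"}
print("wet:", lower(wet), upper(wet))
print("dry:", lower(dry), upper(dry))
```

On this construction the upper probability of an event equals 1 minus the lower probability of its complement, so the interval for ‘wet’ fixes the interval for ‘dry’.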


Abandoning numerical probabilities altogether In contrast to the ‘quantitative’ 
probabilities so far assumed, Fine (1973) sympathetically canvasses various theories 
of comparative probability, exemplified by statements of the form ‘A is at least as 
probable as B’ (A ≽ B). He offers axioms governing ‘≽,’ and explores the 
conditions under which comparative probability can be given a representation in terms of 
Kolmogorovian probabilities. 


Negative and complex-valued probabilities More radically, physicists such as Dirac, 
Wigner, and Feynman have countenanced negative probabilities. Feynman, for 
instance, suggests that a particle diffusing in one dimension in a rod has a probability 
of being at a given position and time that is given by a quantity that takes negative 
values. Depending on how one interprets probability, however, one may instead 
want to say that this function bears certain analogies to a probability function, but 
when it goes negative the analogy breaks down. Cox allows probabilities to take 
values among the complex numbers in his theory of stochastic processes having 
discrete states in continuous time. See Mückenheim (1986) for references. 


Abandoning the normalization axiom It might seem entirely conventional that the 
maximal value a probability function can take is 1. However, it has some non-trivial 
consequences. Coupled with the other axioms, it guarantees that a probability 
function takes at least two distinct values, whereas setting P(Ω) = 0 would not. More 
significantly, it is non-trivial that there is a maximal value. Other measures, such as 
length or volume, are not so bounded. Indeed, Rényi (1970) drops the 
normalization assumption altogether, allowing probabilities to attain the ‘value’ ∞. 

Some authors would loosen the rein that classical logic has on probability, 
allowing logical/necessary truths to be assigned probability less than one, perhaps 
because logical or mathematical conjectures may be more or less well confirmed: see, 
for example, Polya (1968). Furthermore, axiom II (page 363) makes reference to 
the notion of ‘tautology,’ with classical logic implicitly assumed. Proponents of 
non-classical logics may wish to employ instead their favorite ‘deviant’ notion of 
‘tautology’ (perhaps requiring corresponding adjustments elsewhere in the axiomatization). 
Thus, constructivist theories ground probability theory in intuitionistic logic. 


Infinitesimal probabilities Kolmogorov’s probability functions are real-valued. A 
number of philosophers (e.g., Lewis, 1986c [1980], and Skyrms, 1980) drop this 
assumption, allowing probabilities to take values from the real numbers of a 
nonstandard model of analysis; see Skyrms (1980, App. 4) for the construction of 
such a model. In particular, they allow probabilities to be infinitesimal: positive, but 
smaller than every (standard) real number. Various non-empty propositions in 
infinite probability spaces that would ordinarily receive probability zero according to 
standard probability theory, and would thus essentially be treated as impossible, may now be 
assigned positive probability. (Consider selecting at random a point from the [0, 1] 
interval.) In uncountable spaces, regular probability functions cannot avoid taking 
infinitesimal values. 





Abandoning countable additivity Kolmogorov’s most controversial axiom is 
undoubtedly continuity, i.e., the ‘infinite part’ of countable additivity. He regarded it 
as an idealization that finessed the mathematics, but that had no empirical meaning. 
As has been seen, according to the classical, frequency, and certain propensity 
interpretations, probabilities violate countable additivity. De Finetti (1972) marshals a 
battery of arguments against it. Here is a representative one: Countable additivity 
requires one to assign an extremely biased distribution to a denumerable partition of 
events. Indeed, for any ε > 0, however small, there will be a finite number of events 
that have a combined probability of at least 1 - ε, and thus the lion’s share of all the 
probability. 
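
The point about bias can be made concrete with a small sketch. It assumes one particular countably additive assignment to a denumerable partition, namely probability 2^-n for the n-th event (a hypothetical choice; any countably additive assignment summing to 1 would do), and counts how many events are needed to capture at least 1 - ε of the probability.

```python
from fractions import Fraction

def events_needed(prob_of_event, epsilon):
    """Smallest n such that events 1..n have combined probability >= 1 - epsilon.

    Terminates only if the probabilities sum to 1 over the whole partition,
    which is what countable additivity guarantees.
    """
    total = Fraction(0)
    n = 0
    while total < 1 - epsilon:
        n += 1
        total += prob_of_event(n)
    return n, total

def geometric(n):
    """Hypothetical assignment: the n-th event of the partition gets 2**-n."""
    return Fraction(1, 2 ** n)

for eps in (Fraction(1, 10), Fraction(1, 1000), Fraction(1, 10 ** 6)):
    n, mass = events_needed(geometric, eps)
    print(f"epsilon = {float(eps):g}: the first {n} events carry {float(mass):.7f}")
```

However small ε is made, the loop halts after finitely many events, leaving the infinitely many remaining events to share less than ε among them.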


Abandoning finite additivity Various theories of probability that give up even finite 
additivity have been proposed: so-called non-additive probability theories. 

Dempster-Shafer theory begins with a frame of discernment Ω, a partition of 
hypotheses. To each subset of Ω, assign a ‘mass’ between 0 and 1 inclusive; all the 
masses sum to 1. Then define a belief function Bel(A) by the rule: for each subset A 
of Ω, Bel(A) is the sum of the masses of the subsets of A. Shafer (1981) gives this 
interpretation: Suppose that the agent will find out for certain some proposition 
on Ω. Then Bel(A) is the agent’s degree of belief that he will find out A. 
Bel(A) + Bel(¬A) need not equal 1; indeed, Bel(A) and Bel(¬A) are functionally 
independent of each other. Belief functions have many of the same formal properties 
as Koopman’s lower probabilities. Mongin (1994) shows that there are important 
links between epistemic modal logics and Dempster-Shafer theory. 
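
The definition of Bel can be sketched directly. The frame, the mass values, and the function name `bel` below are hypothetical, chosen so that some mass sits on non-singleton subsets; that is what lets Bel(A) + Bel(¬A) fall short of 1.

```python
from fractions import Fraction as F

frame = frozenset({"a", "b", "c"})   # hypothetical frame of discernment

# Hypothetical masses on subsets of the frame; unlisted subsets get mass 0.
mass = {
    frozenset({"a"}): F(3, 10),
    frozenset({"b"}): F(1, 10),
    frozenset({"a", "b"}): F(2, 10),
    frame: F(4, 10),
}
assert sum(mass.values()) == 1

def bel(A):
    """Bel(A): the sum of the masses of the subsets of A."""
    A = frozenset(A)
    return sum((m for S, m in mass.items() if S <= A), F(0))

A = frozenset({"a"})
not_A = frame - A
print("Bel(A)      =", bel(A))
print("Bel(not A)  =", bel(not_A))
print("their sum   =", bel(A) + bel(not_A))
```

The mass placed on {a, b} and on the whole frame counts toward neither Bel(A) nor Bel(¬A), which is why the two values are not tied together as P(A) and P(¬A) are.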

So-called ‘Baconian probabilities’ represent another non-additive departure from 
the probability calculus. The Baconian probability of a conjunction is equal to the 
minimum of the probabilities of the conjuncts. Such ‘probabilities’ are formally 
similar to membership functions in fuzzy logic. Cohen (1977) regards them as 
appropriate for measuring inductive support, and for assessing evidence in a court 
of law. 

For further non-additive probability theories, see (among others) Ghirardato’s 
modeling of ambiguity aversion, Shackle’s potential surprise functions, Dubois and 
Prade’s theory of fuzzy probabilities, Schmeidler’s and Wakker’s respective theories 
of expected utility, and Spohn’s theory of non-probabilistic belief functions. Ghirardato 
(1993) and Howson (1995) have references and more discussion. 


Conditional probability as primitive According to each of the interpretations of 
probability that have been discussed, probability statements are always at least tacitly 
relativized. On the classical interpretation, they are relativized to the set of 
possibilities under consideration; on the logical interpretation, to an evidence statement; on 
the frequency interpretations, to a reference class; on the propensity interpretation, 
to a chance set-up; on the subjective interpretation, to a subject (who may have 
certain background knowledge) at a time. Perhaps, then, it is conditional probability 
that is the more fundamental notion. 

Rather than axiomatizing unconditional probability, and later defining conditional 
probability therefrom, Popper (1959a) and Rényi (1970) take conditional 
probability as primitive, and axiomatize it directly. Popper’s system is more familiar to 
philosophers. His primitives are: 





(i) Ω, the universal set; 
(ii) a binary numerical function p(·,·) of the elements of Ω; 
(iii) a binary operation ab defined for each pair (a, b) of elements of Ω; 
(iv) a unary operation ¬a defined for each element a of Ω. 





Each of these concepts is introduced by a postulate (although the first actually plays 
no role in his theory): 


Postulate 1 The number of elements in Ω is countable. 


Postulate 2 If a and b are in Ω, then p(a, b) is a real number, and the following 
axioms apply: 


A1 Existence There are elements c and d in Ω such that p(a, b) ≠ p(c, d). 


A2 Substitutivity If p(a, c) = p(b, c) for all c in Ω, then p(d, a) = p(d, b) for 
all d in Ω. 





Postulate 3 If a and b are in Ω, then ab is in Ω; and if c is also in Ω, then: 


B1 Monotony p(ab, c) ≤ p(a, c) 
B2 Multiplication p(ab, c) = p(a, bc)p(b, c) 













Postulate 4 If a is in Ω, then ¬a is in Ω; and if b is also in Ω, then: 


C Complementation p(a, b) + p(¬a, b) = p(b, b), unless p(b, b) = p(c, b) for 
all c in Ω. 


Popper also adds a ‘fifth postulate,’ which may be thought of as giving the 
definition of absolute (unconditional) probability: 


Postulate AP If a and b are in Ω, and if p(b, c) = p(c, c) for all c in Ω, then 
p(a) = p(a, b). 


Here, b can be thought of as a tautology. Unconditional probability, then, is 
probability conditional on a tautology. Thus, Popper’s axiomatization generalizes 
ordinary probability theory. A function p(·,·) that satisfies the above axioms is called a 
Popper function. 
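
A minimal sketch, under stated assumptions, of what satisfying the axioms looks like in a finite case. Elements are taken to be the subsets of a three-point set, ab is intersection, ¬a is complement, and p(a, b) is the usual ratio of hypothetical positive weights, with p(a, b) = 1 stipulated when b is empty. The names `p`, `P`, `neg` and `check_axioms` are illustrative, and the brute-force check covers only B1, B2 and C. Because every weight here is positive, the sketch does not exhibit the case of a non-empty b with absolute probability 0.

```python
from fractions import Fraction as F
from itertools import combinations, product

OMEGA = frozenset({0, 1, 2})
WEIGHT = {0: F(1, 2), 1: F(1, 3), 2: F(1, 6)}   # hypothetical positive weights

def subsets(s):
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

ELEMENTS = subsets(OMEGA)

def P(a):
    """Absolute probability of a, from the weights."""
    return sum((WEIGHT[w] for w in a), F(0))

def p(a, b):
    """Two-place probability: the ratio formula, with p(a, b) = 1 for empty b."""
    return P(a & b) / P(b) if b else F(1)

def neg(a):
    return OMEGA - a

def check_axioms():
    """Brute-force check of B1, B2 and C over all elements."""
    for a, b, c in product(ELEMENTS, repeat=3):
        if not p(a & b, c) <= p(a, c):                      # B1 Monotony
            return False
        if p(a & b, c) != p(a, b & c) * p(b, c):            # B2 Multiplication
            return False
    for a, b in product(ELEMENTS, repeat=2):
        exempt = all(p(c, b) == p(b, b) for c in ELEMENTS)  # 'unless' clause of C
        if not exempt and p(a, b) + p(neg(a), b) != p(b, b):
            return False
    return True

print("B1, B2 and C hold:", check_axioms())
print("p({0}, OMEGA) =", p(frozenset({0}), OMEGA))   # probability given a tautology
```

The last line reflects Postulate AP: conditioning on the whole set recovers the absolute probability of the element.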

An advantage of using Popper functions is that conditional probabilities of the 
form p(a, b) can be defined, and can have intuitively correct values, even when b has 
absolute probability 0, rendering the usual conditional probability ratio formula 
inapplicable. For example, the probability that a randomly selected point from 
[0, 1] is 1/4, given E = ‘it is either 1/4 or 3/4’, is plausibly equal to 1/2, and a Popper 
function can yield this result; yet the probability of E is standardly taken to be 0. Popper 
functions also allow a natural generalization of updating by conditionalization, so 
that even items of evidence that were originally assigned probability 0 by an agent 
can be learned. McGee (1994) shows that, in an important sense, probability 
statements cast in terms of Popper functions and those cast in terms of nonstandard 
probability functions are inter-translatable. 


16.5. Probabilistic Semantics and Probability Propagation 


Various notions from standard semantics can be recovered by probabilistic semantics. 
The central idea is to define the logical concepts in terms of probabilistic ones. 
Alternative axiomatizations of probability will then give rise to alternative logics. 


16.5.1. Probabilistic semantics 


Call a statement A of a first-order language L logically true in the probabilistic sense 
if for all probability functions P, P(A) = 1. Where S is a set of statements of L, say 
that A is logically entailed by S in the probabilistic sense if, for all P, P(A) = 1 if 
P(B) = 1 for each member B of S. This sense of logical entailment is strongly sound 
and strongly complete: S ⊢ A iff A is logically entailed by S in the probabilistic sense. 
And taking S to be ∅, then ⊢ A iff A is logically true in the probabilistic sense. 
Popper functions also permit natural definitions of logical truth and logical entailment, 
with analogous soundness and completeness results (Kyburg, 1970, pp. 245 ff.). 
Van Fraassen (1981) exploits (slightly differently axiomatized) primitive conditional 
probability functions in providing probabilistic semantics for intuitionistic propositional 
logic and classical quantifier logic. Probabilistic semantics have been supplied for 
first-order logic with and without identity, modal logic, and conditional logic. See 
Leblanc (1983) for a good general survey and for references. Van Fraassen (1983) 
offers such semantics for relevant logic; Pearl (1991) has a general discussion of 
probabilistic semantics for nonmonotonic logic. 

Probabilistic semantics represent a limiting case of the idea that a valid argument 
is one in which it is not possible for the probabilities of all of the premises to be 
high, while the probability of the conclusion is not. More generally, what can be 
said about the propagation of probability from the premises to the conclusion of a 
valid argument? 


16.5.2. Probability propagation: Adams’ probability logic 


If the premises of a valid argument are all certain, then so is the conclusion. 
Suppose, on the other hand, that the premises are not all certain, but probable to 
various degrees; can one then put bounds on the probability of the conclusion? Or 
suppose that one wants the probability of the conclusion of a given valid argument 
to be above a particular threshold; how probable, then, must the premises be? These 
questions are pressing, since in real-life arguments one typically is not certain of the 
premises, and it may be important to know how confidently one may hold their 
conclusions. Indeed, one knows from the lottery paradox that each premise in a 
valid argument can be almost certain, while the conclusion is certainly false. 
‘Probability logic’ is the name that Adams (1998) gives to the formal study of such 
questions: the study of the transmission (or lack thereof) of probability through 
valid inferences. A sketch of his treatment is given here. 

The hallmark of his probability logic is that traditional concerns with truth and 
falsehood of premises and conclusions are replaced with concerns about their 
probabilities. This, in turn, leads to the nonmonotonic nature of probability logic: a 
conclusion that is initially assigned high probability and hence accepted may later be 
retracted in the face of new evidence. Define the uncertainty u(F) of a sentence F by 


u(F) = 1 - P(F) 


Various important results in probability logic are more conveniently stated in terms 
of uncertainties rather than probabilities. For example: 


Valid inference uncertainty theorem (VIUT) The uncertainty of the conclusion 
of a valid inference cannot exceed the sum of the uncertainties of the premises. 


Hence, the uncertainty of the conclusion of a valid inference can only be large if the 
sum of the uncertainties of the premises is large: witness the lottery paradox, in 
which many small uncertainties in the premises accumulate to yield a maximally 
uncertain conclusion. In particular, if each premise has an uncertainty no greater 
than ε, then there must be at least 1/ε of them for the conclusion to have maximal 
uncertainty. 
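
The lottery case can be worked through numerically. The sketch assumes a hypothetical fair lottery of 100 tickets with exactly one winner; premise i says that ticket i loses, and the conclusion is the conjunction of all the premises. The names `P`, `u`, `premises` and `conclusion` are illustrative.

```python
from fractions import Fraction

n = 100  # hypothetical fair lottery with exactly one winning ticket

# A world is identified with the number of the winning ticket.
worlds = range(n)
prob = {w: Fraction(1, n) for w in worlds}

def P(sentence):
    """Probability of a sentence, represented as a predicate on worlds."""
    return sum((prob[w] for w in worlds if sentence(w)), Fraction(0))

def u(sentence):
    """Uncertainty u(F) = 1 - P(F)."""
    return 1 - P(sentence)

# Premise i: 'ticket i loses'. Conclusion: 'every ticket loses'.
premises = [lambda w, i=i: w != i for i in range(n)]
conclusion = lambda w: all(prem(w) for prem in premises)

total_premise_uncertainty = sum(u(prem) for prem in premises)

print("uncertainty of each premise :", u(premises[0]))
print("sum over all premises       :", total_premise_uncertainty)
print("uncertainty of conclusion   :", u(conclusion))
```

Each premise has uncertainty 1/n, so exactly n = 1/ε premises are needed before the bound in the VIUT permits a maximally uncertain conclusion, and in this case the bound is reached.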

The VIUT gives a bound on the uncertainty of the conclusion. Under certain 
circumstances, the bound can be achieved. Call a premise of a valid inference 
essential if the inference that omits that premise but that is otherwise the same is invalid. 
We have: 


Uncertainty bound attainment theorem Suppose F_1, ..., F_n ∴ F is valid, and 
let u_1, ..., u_n be nonnegative, with Σ u_i ≤ 1. If the premises are consistent and 
all essential, then there is an uncertainty function u(·) such that u(F_i) = u_i for 
i = 1, ..., n, and u(F) = u_1 + ... + u_n. 





However, such ‘worst case’ uncertainties in a conclusion can be reduced by 
introducing redundancy among the premises from which it is derived. For it can further 
be shown that, given a valid inference with various premises, different subsets of 
which entail the conclusion, the conclusion’s uncertainty cannot be greater than the 
total uncertainty of that subset with the smallest total uncertainty. Define a minimal 
essential premise set to be an essential premise set that has no proper subsets that 
are essential. Suppose that there is a valid inference with premises F_1, ..., F_n. The 
degree of essentialness of premise F_i, e(F_i), is: 1/k, where k is the cardinality of the 
smallest essential set of premises to which F_i belongs, if F_i belongs to some minimal 
essential set, and 0 otherwise. Intuitively, e(F_i) is a measure of how much ‘work’ F_i 
does in a valid inference. 

The VIUT can now be generalized: 


Theorem If F_1, ..., F_n ∴ F is valid, then u(F) ≤ e(F_1)u(F_1) + ... + e(F_n)u(F_n). 


One thus can lower the upper bound on u(F) from that given by the VIUT. 

Adams calls an inference probabilistically valid iff, for any ε > 0, there exists a 
δ > 0 such that, under any probability assignment according to which each of 
the premises has probability greater than 1 - δ, the conclusion has probability 
at least 1 - ε. A linchpin of his account of probabilistic validity is his treatment of 
conditionals. According to him, a conditional has no truth value, and hence sense 
cannot be made of the probability of its truth. Yet conditionals can clearly figure 
either as premises or conclusions of arguments, and one still wants to be able to 
assess these arguments. How, then, does one determine the probabilities of 
conditionals? This leads to another important point of cross-fertilization between 
probability and logic. 


16.6. Probabilities of Conditionals 


Probability and logic are intimately intertwined in the study of probabilities of 
conditionals. In the endeavor to furnish a logical analysis of natural language, the 
conditional has proved to be somewhat recalcitrant, and the subject of considerable 
controversy. [See chapter 17.] Meanwhile, the notion of ‘conditionality’ is 
seemingly well understood in probability theory, and taken by most to be enshrined in 
the usual ratio formula for conditional probability. Thus, fecund research programs 
have been founded on both promoting and parrying a certain marriage between the 
logic of conditionals and probability theory: the hypothesis that probabilities of 
conditionals are conditional probabilities. More precisely, the hypothesis is that 
some suitably quantified and qualified version of the following equation holds: 


(PCCP) P(A → B) = P(B | A) for all A, B in the domain of P, with P(A) > 0 


where ‘→’ is a conditional connective. 

The best known presentations of this hypothesis are due to Stalnaker (1970) and 
Adams (1975). Stalnaker hoped that a suitable version of it would serve as a 
criterion of adequacy for a truth-conditional account of the conditional. He explored the 
conditions under which it would be reasonable for a rational agent, with subjective 
probability function P, to believe a conditional A → B. By identifying the probability 
of A → B with P(B | A), Stalnaker was able to put constraints on the truth 
conditions of ‘→’ (for example, the upholding of conditional excluded middle) that 
supported his preferred C2 logic. While Adams eschewed any truth-conditional 
account of the conditional, he was happy to speak of the probability of a 
conditional, equating it to the corresponding conditional probability. This allowed him to 
extend his notion of probabilistic validity to arguments that contain conditionals, 
arguments that in his view lie outside the scope of the traditional account of validity 
couched in terms of truth values. He argued that the resulting scheme respects 
intuitions about which inferences are reasonable, and which not. 

With these motivations in mind, and for their independent interest, there are four 
salient ways of rendering precise the hypothesis that probabilities of conditionals are 
conditional probabilities: 


Universal version There is some → such that for all P, PCCP holds. 


Rational Probability Function version There is some → such that for all rational 
subjective probability functions P, PCCP holds. 


Universal Tailoring version For each P there is some → such that PCCP holds. 


Rational Probability Function Tailoring version For each rational subjective 
probability function P, there is some → such that PCCP holds. 


If any of these versions can be sustained, then important links between logic and 
probability theory will have been established, just as Stalnaker and Adams hoped. 
Probability theory would be a source of insight into the formal structure of 
conditionals; and probability theory, in turn, would be enriched, since one could characterize 
more fully what the usual conditional probability ratio means, and what its use is. 

There is now a host of results (mostly negative) concerning PCCP. Some 
preliminary definitions will assist in stating some of the most important ones. If 
PCCP holds (for a given → and P) then → is said to be a PCCP-conditional for P, 
and P is said to be a PCCP-function for →. If PCCP holds for each member P of a 
class of probability functions 𝒫, then → is said to be a PCCP-conditional for 𝒫. A 
pair of probability functions P and P′ are orthogonal if, for some A, P(A) = 1 but 
P′(A) = 0. Call a proposition A a P-atom iff P(A) > 0 and, for all X, either 
P(AX) = P(A) or P(AX) = 0. Finally, a probability function is called trivial if it has 
at most four different values. 

The negative results are ‘triviality results’: only trivial probability functions can 
sustain PCCP, given certain assumptions. The earliest and most famous results are 
due to Lewis (1986a [1976]), which he later strengthens (1986b). Their upshot is 
that there is no PCCP-conditional for any class of probability functions closed under 
conditionalizing (restricted to the propositions in a single finite partition), or under 
Jeffrey conditionalizing, unless the class consists entirely of trivial functions. These 
results refute the Universal version of the hypothesis. They also spell bad news for 
the Rational Probability Function version, since rationality surely permits having a 
non-trivial probability function and updating by (Jeffrey) conditionalizing. This version 
receives its death blow from a result by Hall (1994a) that significantly strengthens 
Lewis’ results: 


Orthogonality result Any two non-trivial PCCP-functions for a given → with the 
same domain are orthogonal. 


It follows from this that the Rational Probability Function version is true only if any 
two rational agents’ probability functions are orthogonal if distinct, which is absurd. 

So far, the ‘tailoring’ versions remain unscathed. The Universal Tailoring version 
is refuted by the following result due to Hájek (1989, here slightly strengthened): 


Finite probability functions result Any non-trivial probability function with finite 
range has no PCCP-conditional. 
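
One instance of this result can be checked by brute force. The sketch assumes a hypothetical uniform probability function on four worlds, which takes five different values and so is non-trivial. Whatever proposition A → B turns out to be, its probability must lie in the range of P; the code lists the conditional probabilities P(B | A) that lie outside that range, so that PCCP cannot hold for the corresponding A and B. This illustrates the result for one function and is not a proof of it.

```python
from fractions import Fraction as F
from itertools import combinations

WORLDS = (0, 1, 2, 3)
WEIGHT = {w: F(1, 4) for w in WORLDS}   # hypothetical uniform P on four worlds

def propositions():
    """All subsets of the set of worlds."""
    return [frozenset(c) for r in range(len(WORLDS) + 1)
            for c in combinations(WORLDS, r)]

def P(a):
    return sum((WEIGHT[w] for w in a), F(0))

# Every value that P assigns to some proposition.
range_of_P = {P(a) for a in propositions()}

# Conditional probabilities P(B | A), with P(A) > 0, that no proposition
# whatsoever has as its unconditional probability.
unmatched = {
    P(a & b) / P(a)
    for a in propositions() if P(a) > 0
    for b in propositions()
} - range_of_P

print("range of P:", sorted(range_of_P))
print("values of P(B | A) outside the range:", sorted(unmatched))
```

Taking A to contain three worlds and B one of them gives P(B | A) = 1/3, a value that no proposition in this space can receive, so no connective → can satisfy PCCP for this P.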


This result also casts serious doubt on the Rational Probability Function Tailoring version, for 
it is hard to see why rationality requires one to have a probability function with 
infinite range. If one makes a minimal assumption about the logic of the ‘→,’ 
matters are still worse thanks to another result of Hall’s (1994): 


No Atoms Result Given (Ω, F, P), suppose that PCCP holds for P and a → that 
obeys modus ponens; then (Ω, F, P) does not contain a P-atom, unless P is trivial. 


It follows, on pain of triviality, that the range of P, and hence Ω and F, are 
uncountable. All the more, it is hard to see how rationality requires this of an 
agent’s probability space. 

It seems, then, that all four versions of the hypothesis so far considered are 
untenable. For all that has been said so far, though, a ‘tailoring’ version restricted to 
uncountable probability spaces might still survive. Indeed, here there is a positive 
result due to van Fraassen (1976). Suppose that → distributes over ∩ and ∪, obeys 
modus ponens and centering, and the principle that A → A = Ω. Such an → conforms 
to the logic CE. Van Fraassen shows: 


CE tenability result Any probability space can be extended to one for which 
PCCP holds, with an → that conforms to CE. 


Of course, the larger space for which PCCP holds is uncountable. He also shows 
that → can have still more logical structure, while supporting PCCP, provided one 
restricts the admissible iterations of → appropriately. 

A similar strategy of restriction protects Adams’ version of the hypothesis from the 
negative results. He applies a variant of PCCP to unembedded conditionals of the 
form A → B, where A and B are conditional-free. More precisely, he proposes: 


Adams’ Thesis (AT) For an unembedded conditional A → B, 


P(A → B) = P(B | A)  if P(A) > 0 
         = 1         otherwise 


Since Adams does not allow the assignment of probabilities to Boolean compounds 
of conditionals, thus violating the closure assumptions of the probability calculus, 
‘P’ is not strictly speaking a probability function (and thus the negative results, 
which presuppose that it is, do not apply). McGee (1994) extends Adams’ theory to 
certain more complicated compounds of conditionals. He later refines AT, using 
Popper functions to give a more nuanced treatment of conditionals with 
antecedents of probability 0. Finally, Stalnaker and Jeffrey (1994) offer an account of the 
conditional as a random variable. They recover an analogue of AT, with expectations 
replacing probabilities, and generalize it to encompass iterations and Boolean 
compounds of conditionals. 

As the recency of much of this literature indicates, this is still a flourishing field of 
research. The same can be said for virtually all of the points of contact between 
probability and logic that have been surveyed here. 


Suggested further reading 


Skyrms (1999) is an excellent introduction to the philosophy of probability. Von Plato 
(1994) is more technically demanding and more historically oriented, with an extensive 
bibliography that has references to many landmarks in the development of probability theory 
this century. Fine (1973) is still a highly sophisticated survey of and contribution to various 
foundational issues in probability. Billingsley (1995) and Feller (1968) are classic textbooks 
on the mathematical theory of probability. Mückenheim (1986) surveys the literature on 
‘extended probabilities’ that take values outside the real interval [0, 1]. Eells and Skyrms 
(1994) is a fine collection of articles on probabilities of conditionals. Fenstad (1980) discusses 
further connections between probability and logic, emphasizing probability functions defined 
on formal languages, randomness and recursion theory, and non-standard methods. A vast 
bibliography of the literature on probability and induction pre-1970 can be found in Kyburg 
(1970). Also useful for references before 1967 is the bibliography for ‘Probability’ in the 
Encyclopedia of Philosophy. Earman (1992) and Howson and Urbach (1993) have more recent 
bibliographies, and give detailed presentations of the Bayesian program. 


Notes 


1 Lebesgue’s theory of measure and integration allows a highly sophisticated treatment of 
various further concepts in probability theory (random variable, expectation, martingale, 
and so on), all based ultimately on the characterization of the probability of an event 
as the measure of a set. Important limit theorems, such as the laws of large numbers and 
the central limit theorem, are beyond the scope of this chapter. The interested reader is 
directed to references in the suggested further reading. 

2 Space limitations preclude discussing various other important approaches, including Dawid’s 
(1992) prequential theory; Fisher’s (1973) fiducial probability, explored further by Fisher, 
Kyburg (1974) and Seidenfeld (1992); those based on fuzzy logic; and those based on 
complexity theory. 

3 This article was written mostly at Cambridge University, and I am grateful to the Philosophy 
Department and to Wolfson College for the hospitality I was shown there. I also especially 
thank Jeremy Butterfield, Alex Byrne, Tim Childers, Haim Gaifman, Matthias Hild, 
Christopher Hitchcock, Colin Howson, Paul Jeffries, Isaac Levi, Vann McGee, Teddy Seidenfeld, 
Brian Skyrms, Brian Weatherson and Jim Woodward for their very helpful comments. 


Chapter 17 
Conditionals 


Dorothy Edgington 


A simple statement 


It will rain soon. 
Mary cooked the dinner. 


can have a conditional clause attached, making a conditional statement: 


It will rain soon if the clouds don’t blow away. 
If John didn’t cook the dinner, Mary cooked it. 


These, traditionally called ‘indicative conditionals’, are my topic. I do not have space 
to discuss theories of ‘subjunctive’ or ‘counterfactual’ conditionals like 


John would have cooked the dinner if Mary had not done so. 
If the wind hadn't blown the clouds away, it would have rained. 


That there is some difference between indicatives and subjunctives is shown by pairs 
of examples like 


If Oswald didn’t kill Kennedy, someone else did. 
and 
If Oswald hadn't killed Kennedy, someone else would have. 


One can accept the first but reject the second (Adams, 1970). That there is not a 
huge gulf between them is shown by examples like the following: 


“Don't go in there,” I say, “if you go in you will get hurt.” 
You look skeptical but stay outside, when there is a loud crash as the ceiling collapses. 
I say, 


You see, if you had gone in you would have been hurt. I told you so. 


It is controversial how best to classify conditionals. According to some theorists, 
the forward-looking indicatives (those with a ‘will’ in the main clause) belong with 
the subjunctives (those with a ‘would’ in the main clause), and not with the other 
indicatives.¹ The easy transition from typical ‘wills’ to ‘woulds’ is indeed a datum to 
be explained. Still, straightforward statements about the past, present or future, to 
which a conditional clause is attached (the traditional class of indicative conditionals) 
do, in my view, constitute a single semantic kind. The theories I discuss do not 
fare better or worse when restricted to a particular subspecies. 


17.1. Truth Conditions for Indicative Conditionals 


An indicative conditional sentence, ‘If A, B,’ has two constituent sentences, or 
sentence-like clauses, A and B, called the antecedent and consequent respectively. It is 
part of the task of compositional semantics to specify the meaning of a complex 
sentence as a function of the meanings of its parts. The generally most fruitful and 
time-honored approach is to specify the truth conditions of the complex sentence as 
a function of the truth conditions of its parts. A semantics of this kind illuminates 
the question of the validity of arguments involving the complex sentences, given the 
conception of validity as necessary preservation of truth [see chapter 6]. This section 
assumes that this approach to conditionals is correct; that is, it assumes that 
conditionals have truth conditions. Let A and B be two sentences such as ‘Ann is in 
Paris’ and ‘Bill is in Paris.’ Our question will be: Does ‘If A, B’ have simple, 
extensional, truth-functional truth conditions, as ‘A and B,’ ‘A or B’ and ‘It is not 
the case that A’ do? That is, do the truth values of A and of B determine the truth 
value of ‘If A, B’? Or are they non-truth-functional, like those of ‘A because B’, ‘A 
before B’, ‘It is possible that A’? That is, do the truth values of A and B, in some 
cases, leave open the truth value of ‘If A, B’? 

The truth-functional conditional was integral to Frege’s new logic (1960). It was
taken up enthusiastically by Russell (who called it ‘material implication’), Wittgenstein
and the logical positivists, and it is now found in every logic text. It is the first
theory of conditionals that students of philosophy encounter. Typically, it does not
strike students as obviously correct: some have been known to paste the truth table
for ‘if’ above their bed as an aide-mémoire. It is logic’s first surprise. Yet, as the
textbooks testify, it does a creditable job in many circumstances. And it has many
defenders. It is a strikingly simple theory: ‘If A, B’ is true iff (if and only if) it is not
the case that (‘A’ is true and ‘B’ is false). It is thus equivalent to


¬(A & ¬B)


and to


¬A ∨ B


‘A ⊃ B’ has, by stipulation, this truth condition. Our question is whether this is an
adequate rendering of ‘If A, B’.

It’s easy to see that if ‘if’ is truth-functional, this particular truth function, depicted
in column (i) of table 17.1, is the correct one. For sometimes ‘If A, B’ is true when
‘A’ and ‘B’ are, respectively, (true, true), or (false, true), or (false, false). For instance,


If it’s a square, it has four sides. 


is true whether the unknown shape is a square, an oblong rectangle, or a triangle.
Assuming truth-functionality, it follows that conditionals are always true for these
combinations of truth values of their parts. The remaining case, (true, false), is
unrealizable in this example. Assuming truth-functionality, the conditional must be
false in this case; otherwise, there would be no such thing as a false conditional - all
conditionals would be tautologies. This last case is the most obviously correct,
anyway. If it were possible to have ‘A’ true, ‘B’ false, and ‘If A, B’ true, it would be
unsafe to infer ‘B’ from ‘A’ and ‘If A, B’: modus ponens would be invalid. Taking for
granted that modus ponens is valid for any interpretation of ‘If A, B’ worth taking
seriously, any acceptable interpretation of ‘If A, B’ must entail ¬(A & ¬B), i.e., A ⊃ B.

There are different proposals for non-truth-functional truth conditions for ‘If
A, B’. I am here concerned only with schematic features. I shall use ‘A → B’ as a
generic representation of any such conditional. On some interpretations, ‘A → B’
differs from ‘A ⊃ B’ only when A is false. For instance, Stalnaker’s (1991b [1968])
proposal is of this type. I say


(S) If you strike the match, it will light.





If you do strike it, my remark is true if it lights, false if it does not - in agreement
with the truth-functional account. If you don’t strike it, the truth value of S, on this
account, depends on whether the match lights in a non-actual possible world in
which you do strike it, and which otherwise differs minimally from the actual world.
Suppose that actually you don’t strike it, and there is a hurricane blowing. In the
world most like the actual world in which you strike it, it doesn’t light. S is false.
Suppose that actually you don’t strike it, and conditions are ideal for the lighting of
matches. In the world most like the actual world in which you strike the match, it
lights. S is true.

On other non-truth-functional interpretations, ‘A → B’ may be false not only
when A is false, but also when A and B are both true. For instance, if, in some sense
of ‘necessitate,’ the truth of ‘A → B’ requires that A necessitate B, A & B is not
sufficient for A → B. I represent below a non-truth-functional account of Stalnaker’s
kind, but the argument which follows applies to any non-truth-functional account.

Let A and B be two logically independent propositions. The four lines in
table 17.1 represent the four incompatible logical possibilities for the truth values of
A and B. ‘If A, B’, ‘If ¬A, B’ and ‘If A, ¬B’ are interpreted truth-functionally in
columns (i)-(iii), and non-truth-functionally in columns (iv)-(vi). ‘T/F’ means both
truth values are open for the corresponding assignment of truth values to A and B.
For instance, line 4, column (iv), represents two possibilities for A, B and A → B:
(F, F, T) and (F, F, F).

Column (i) may reasonably be said to specify the meaning of the truth-functional
conditional, in terms of the meanings of A and of B. The meanings of A and of B,
together with how the world is, determine their truth values. The four lines represent
four exclusive and exhaustive ways the world might be. Column (i) shows,
whichever way the world is for A and B, how to determine the truth value of A ⊃ B.
Column (iv) does not pretend to specify the meaning of a non-truth-functional
conditional. Rather, once that meaning has been specified (e.g. in Stalnaker’s way),
it follows that this array of logically possible combinations of truth values for A, B
and A → B exists.
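The truth-functional half of table 17.1 can be regenerated mechanically. The following is a minimal sketch, not part of the original text: the names `hook` and `table` are illustrative assumptions, and only columns (i)-(iii) are computed, since the non-truth-functional columns (iv)-(vi) are by definition not fixed by the truth values of A and B.

```python
from itertools import product


def hook(a: bool, b: bool) -> bool:
    """Material conditional A ⊃ B: false only when A is true and B is false."""
    return (not a) or b


# Lines 1-4 of the table: (T, T), (T, F), (F, T), (F, F).
ROWS = list(product([True, False], repeat=2))


def table():
    """Rows of (A, B, A ⊃ B, ¬A ⊃ B, A ⊃ ¬B), i.e. columns (i)-(iii)."""
    return [(a, b, hook(a, b), hook(not a, b), hook(a, not b)) for a, b in ROWS]


for line_no, row in enumerate(table(), start=1):
    print(line_no, " ".join("T" if v else "F" for v in row))
```

Each printed row lists the line number, the values of A and B, and then the three truth-functional columns.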





17.1.1. Arguments for truth functionality 


The main argument points to the fact that minimal knowledge that the
truth-functional truth condition is satisfied is enough for knowledge that if A, B. In short:
suppose there are two balls in a bag, a and b. All that is known about their color is
that at least one of them is red. That’s enough to know that if a isn’t red, b is red.
Or: all that is known is that they are not both red. That’s enough to know that if a
is red, b is not red.

Suppose there is no information to start with about which of the four possible
combinations of truth values for A and B obtains. Then compelling reason is acquired
to think that A ∨ B. There is no stronger belief about the matter; in particular, there
is no firm belief as to whether or not A. Line 4 is ruled out; the other possibilities
remain open. Supposing all that, then, intuitively, one is justified in inferring that
if ¬A, B. Look at the possibilities for A and B on the left. The possibility that both
A and B are false has been eliminated. So if A is false, only one possibility remains:
B is true.

The truth-functionalist (call him Hook) gets this right. Look at column (ii).
Eliminate line 4 and line 4 only, and the only possibility in which ‘¬A ⊃ B’ is false
has been eliminated. One knows enough to conclude that ‘¬A ⊃ B’ is true.
The non-truth-functionalist (call him Arrow) gets this wrong. Look at column
(v). Eliminate line 4 and line 4 only, and some possibility of falsity must remain in
other cases which have not been ruled out. By eliminating just line 4, one does not
ipso facto eliminate these further possibilities, incompatible with line 4, in which
‘¬A → B’ is false.

The same point can be made with negated conjunctions. Suppose one knows for sure that
¬(A & B), but nothing stronger than that. In particular, one does not know whether
or not A. Line 1 is ruled out; nothing more. One may justifiably infer that if A, ¬B.
Hook gets this right. In column (iii), if line 1 is eliminated, this leaves only cases in
which ‘A ⊃ ¬B’ is true. Arrow gets this wrong. In column (vi), eliminating just line
1 leaves open the possibility that ‘A → ¬B’ is false.

Intuitively, in evaluating ‘If A, ¬B,’ one supposes that A is true. That is, one
supposes that line 1 or line 2 obtains. When it is known that line 1 does not obtain,
one concludes that line 2 obtains. If A is true, B is not. End of story. Arrow agrees
that ‘A → ¬B’ is true if line 2 obtains. But, for him, that is not the end of the story.
For A may be false, and in this case, ‘A → ¬B’ may be false. Now there is something
counterintuitive about this part of his thought experiment: when considering whether
¬B is true if A is true, why should one have to bother to think about what is the
case if A is false?

The same argument renders compelling the thought that if one eliminates just
A & ¬B - nothing stronger, i.e., one does not eliminate A - then there is sufficient
reason to conclude that if A, B.

Hook’s second argument is in the style of Natural Deduction. The three premisses
¬(A & B), A and B entail a contradiction. So, by reductio ad absurdum, ¬(A & B)
and A entail ¬B. So, by Conditional Proof (CP), ¬(A & B) entails ‘If A, ¬B’.
Substitute ¬C for B, and then, provided Double Negation Elimination is allowed,
one has a proof of ‘If A, C’ from ¬(A & ¬C).

Conditional Proof raises no eyebrows. It seems sound. ‘From X and Y, it follows
that Z.’ ‘From X, it follows that if Y, Z.’ The ‘if’-clause in the latter sentence seems
to function just as the second premiss in the former. Yet for no reading of ‘if’ which
is stronger than the truth-functional reading is CP valid - at least this is so if ‘&’ and
‘¬’ are treated in the classical way and hence one accepts the validity of the inference:








(I) ¬(A & ¬B), A ⊢ B

Suppose CP is valid for some interpretation of ‘If A, B’. Applying CP to (I) gives

¬(A & ¬B) ⊢ If A, B.

That is, A ⊃ B ⊢ If A, B.
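The classical half of this argument can be checked by brute force over truth-value assignments. The sketch below is an illustration under stated assumptions, not the author's method: `entails` is a hypothetical helper, and the check covers only the truth-functional reading of ‘if’. It confirms inference (I) and the result of discharging A when the conditional is read as ⊃. It says nothing about stronger readings, which is the point at issue.

```python
from itertools import product


def hook(a: bool, b: bool) -> bool:
    """Material conditional A ⊃ B."""
    return (not a) or b


def entails(premisses, conclusion, n_vars=2):
    """Classical entailment: no assignment makes every premiss true and the conclusion false."""
    return all(
        conclusion(*v)
        for v in product([True, False], repeat=n_vars)
        if all(p(*v) for p in premisses)
    )


def not_a_and_not_b(a, b):
    """The premiss ¬(A & ¬B)."""
    return not (a and not b)


# (I): ¬(A & ¬B), A ⊢ B
inference_I = entails([not_a_and_not_b, lambda a, b: a], lambda a, b: b)

# Discharging A, with 'if' read truth-functionally: ¬(A & ¬B) ⊢ A ⊃ B
discharged = entails([not_a_and_not_b], hook)

print(inference_I, discharged)
```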


17.1.2. Arguments against truth functionality 


The best-known objection, one of the ‘paradoxes of material implication’, is that,
according to Hook, the falsity of A is logically sufficient for the truth of ‘If A, B’
[see chapter 13]. Look at the last two lines of column (i). In every possible situation
in which A is false, ‘If A, B’ is true. Can it be right that the falsity of


She ate the apple.
entails the truth of
If she ate the apple, she was ill?


Hook might respond as follows. How can intuitions about the validity of an
inference be tested? The direct way is to imagine that one knows for sure that the
premiss is true, and to consider what one would then think about the conclusion.
Now when one knows for sure that something, A, is true, there is no place for
thoughts beginning ‘If A is false ...’. When one knows for sure that Harry did it,
one does not think or remark ‘If Harry didn’t do it ...’. In this circumstance,
conditionals have no role to play, and one has no practice in assessing them. The
direct intuitive test is, therefore, silent on whether ‘If A, B’ follows from ¬A. If the
smoothest, simplest, generally satisfactory theory has the consequence that it does
follow, perhaps one should learn to live with this consequence.

There may, of course, be further consequences of this feature of ‘⊃’ which jar
with intuition. That needs investigating. But, Hook may add, even if one concludes
that ‘⊃’ does not fit perfectly our natural language use of ‘if,’ it comes close, and it
has the virtues of simplicity and clarity. As has been seen, rival theories also have
counterintuitive consequences. Natural language is a fluid affair, and theories cannot
be expected to achieve better than approximate fit. Perhaps, in the interests of
precision and clarity, in serious reasoning the untidy and unclear ‘if’ should be
replaced with its neat, close relative, ‘⊃.’

This was no doubt Frege’s attitude. Frege’s primary concern was to construct a
system of logic, formulated in an idealized language, which was adequate for
mathematical reasoning. If ‘A ⊃ B’ does not translate perfectly our natural language ‘If
A, B’, but plays its intended role, so much the worse for natural language.

Perhaps, for the purpose of doing mathematics, Frege’s judgment was correct.
The main defects of the truth-functional conditional do not show up in
mathematics. There are some peculiarities, but as long as one is aware of them, they can be
lived with. And arguably, the gain in simplicity and clarity more than offsets the
oddities.

The oddities become less tolerable when considering conditional judgments about
empirical matters. The difference is this: in thinking about the empirical world,
propositions are often accepted and rejected with degrees of confidence less than
certainty.


I think, but am not sure, that A


plays no central role in mathematical thinking. Perhaps the use of indicative
conditionals can be dismissed as unimportant in circumstances in which one is certain that
the antecedent is false. But the use of conditionals whose antecedent is thought to
be likely to be false cannot be ignored. They are used often; some are accepted,
others are rejected.


I think I won’t need to get in touch, but if I do, I shall need a phone number,
you say, as your partner is about to go away; not


If I do, I’ll manage by telepathy.
I think John spoke to Mary; if he didn’t, he wrote to her,


not
If he didn’t, he shot her.


Hook’s theory has the appalling consequence that all conditionals with unlikely
antecedents are likely to be true. To think it likely that ¬A is to think it likely that
a sufficient condition for the truth of A ⊃ B obtains. Take someone who thinks the
Republicans won’t win the election, and who does not think that if they do win,
they will double income tax (i.e., he rejects that). According to Hook, this person
has grossly inconsistent opinions. Not only does Hook’s theory fit badly the patterns
of thought of competent, intelligent people. It cannot be claimed that one would be
better off with ‘⊃.’ On the contrary, one would be intellectually disabled: lacking
the power to discriminate between believable and unbelievable conditionals whose
antecedent is thought to be likely to be false.

Arrow does not have this problem. His theory explicitly avoids it, by allowing that
‘A → B’ may be false when A is false.

The other paradox of material implication is that, according to Hook, all
conditionals with true consequents are true: B ⊢ A ⊃ B. This is perhaps less obviously
unacceptable: if I’m sure that B, and treat A as an epistemic possibility, I must be
sure that if A, B. Again, the problem becomes vivid when considering the case
where I’m nearly sure, but not quite sure, that B. I think B may be false; and will be
false if certain, in my view unlikely, circumstances obtain. For example, I think that
Fred is giving a lecture right now. I don’t think that if he was seriously injured on
his way to work, he is giving a lecture right now. But on Hook’s account, the truth
of the consequent is a logically sufficient condition for the truth of the conditional:
the conditional is false only if the consequent is false. So on this account, no one
can, without gross irrationality, think that the consequent is likely to be true, but
the conditional is unlikely to be true.


17.1.3. The pragmatic defence of truth-functionality


Grice famously defended the truth-functional account in his William James lectures,
‘Logic and Conversation,’ delivered in 1967 (Grice, 1989). There are many ways
of speaking the truth yet misleading your audience, given the standards to which
you are expected to conform in conversational exchange. One way is to say
something weaker than some other relevant thing you are in a position to say. Consider
disjunctions. I am asked where John is. I am sure that he is in the pub, and know
that he never goes near libraries. Inclined to be unhelpful but not wishing to lie,
I say


He is either in the pub or the library.


My hearer naturally assumes that this is the most precise information I am in a
position to give, and also concludes from the truth (let us assume) that I told him


If he’s not in the pub he’s in the library.


The conditional, like the disjunction, according to Grice, is true if he’s in the pub,
but misleadingly asserted on that ground.
Another example, from David Lewis (1986c [1976], p. 143):


You won’t eat those and live,


I say of some wholesome and delicious mushrooms - knowing that you will now
leave them alone, deferring to my expertise. I told no lie - for indeed you don’t eat
them - but of course I misled you.

Grice drew attention, then, to situations in which a person is justified in believing
a proposition, which would nevertheless be an unreasonable thing for the person to
say, in normal circumstances. His lesson was salutary and important. He is, I think,
right about disjunctions and negated conjunctions. Believing that John is in the
pub, I can’t consistently disbelieve


He is either in the pub or the library.


If I have any epistemic attitude to this proposition, it should be one of belief,
however inappropriate it is for me to assert it. Similarly for


You won’t eat those and live,
when I believe you won’t eat them. But the difficulties with the truth-functional
conditional cannot be explained away in terms of what is an inappropriate
conversational remark. They arise at the level of belief. Thinking that John is in the pub,
I may without irrationality disbelieve

If he isn’t in the pub, he’s in the library.
Thinking you won’t eat the mushrooms, I may without irrationality reject


If you eat them you will die.
As facts about the norms to which people defer, these claims can be tested. A good
enough test is to take a co-operative person, who understands that you are merely
interested in her opinions, as opposed to what would be a reasonable remark to
make, and note which conditionals she assents to. Are we really to brand as illogical
someone who dissents from both


The Republicans will win.
and
If the Republicans win, income tax will double?


The Gricean phenomenon is a real one. On anyone’s account of conditionals,
there will be circumstances in which a conditional is justifiably believed, but is liable
to mislead if stated. For instance, I believe that the match will be cancelled, because
all the players have ’flu. I believe that whether or not it rains, the match will be
cancelled: if it rains, the match will be cancelled, and if it doesn’t rain, the match will
be cancelled. But I would mislead my audience by responding to a query about
whether the match will be cancelled by asserting the first of these conditionals. This
does not demonstrate that Hook is correct. Although I believe that the match will
be cancelled, I don’t believe that if all the players make a very speedy recovery, the
match will be cancelled.


17.1.4. Compounds of conditionals: Against Hook, and against Arrow


¬(A ⊃ B) is equivalent to A & ¬B. Intuitively, one may safely say, of an unseen
geometric figure,


It's not the case that if it’s a pentagon, it has six sides. 


But by Hook's lights, one may well be wrong; for it may not be a pentagon. 
Another example, due to Gibbard (1981, pp. 235-6): Of a glass that had been held 
a foot above the floor, I say (having left the scene), 


If it broke if it was dropped, it was fragile. 


Intuitively this is reasonable. But by Hook’s lights, if the glass was not dropped, and
was not fragile, my conditional has a true (conditional) antecedent and false
consequent, and is hence false. Grice’s strategy was to explain why we don’t assert certain
conditionals which we have reason to believe true. In these two cases, the problem
is reversed: there are compounds of conditionals which we confidently assert and
accept which, by Hook’s lights, we do not have reason to believe true.
Disjunctions are also troublesome. (A ⊃ B) ∨ (¬A ⊃ B) is a tautology. If you
deny that you will be upset if you are not promoted, you are committed to accepting
that you will be upset if you are promoted. And here is my favorite truth-functionally
valid proof of the existence of God:


If God does not exist, it’s not the case that if I pray, my prayers will be answered.

I do not pray.

Therefore God exists.
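That these oddities really do follow on the truth-functional reading can be confirmed by enumerating truth-value assignments. This is a rough sketch only; the helper `valid` and the letters g, p, r (God exists, I pray, my prayers are answered) are assumptions introduced for illustration.

```python
from itertools import product


def hook(a: bool, b: bool) -> bool:
    """Material conditional A ⊃ B."""
    return (not a) or b


def valid(premisses, conclusion, n_vars):
    """True if every assignment satisfying all premisses also satisfies the conclusion."""
    return all(
        conclusion(*v)
        for v in product([True, False], repeat=n_vars)
        if all(p(*v) for p in premisses)
    )


# ¬(A ⊃ B) has the same truth value as A & ¬B on every line.
negation_equiv = valid([], lambda a, b: (not hook(a, b)) == (a and not b), 2)

# (A ⊃ B) ∨ (¬A ⊃ B) is a tautology.
disjunction_taut = valid([], lambda a, b: hook(a, b) or hook(not a, b), 2)

# ¬G ⊃ ¬(P ⊃ R), ¬P, therefore G.
god_argument = valid(
    [lambda g, p, r: hook(not g, not hook(p, r)), lambda g, p, r: not p],
    lambda g, p, r: g,
    3,
)

print(negation_equiv, disjunction_taut, god_argument)
```

All three checks succeed, which is exactly the complaint: the argument is valid for ⊃ although nobody would accept it for the ordinary ‘if’.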

Arrow does fine with the above examples. But other cases of embedded
conditionals count in the opposite direction. Here are two sentence forms which are,
intuitively, equivalent:


(i) If (A & B), C.
(ii) If A, then if B, C.


Try any example:


If Mary comes then if John doesn’t have to leave early we will play Bridge.
If Mary comes and John doesn’t have to leave early we will play Bridge.
If they were outside and it rained, they got wet.

If they were outside, then if it rained they got wet.


The intuitive case for Import-Export, as this equivalence has been called (McGee,
1985), is as follows. Consider (ii). In assessing this, one supposes that the antecedent,
A, is true, and makes a judgment about the consequent under that supposition. As
the consequent is also a conditional, one supposes that its antecedent, B, is also true.
A judgment is then made about its consequent, C, under these suppositions. The
thought-experiment is equivalent to that of making a judgment about C under the
supposition that A & B.

For Hook, Import-Export holds. Gibbard (1981, pp. 234-5) has proved that for
no conditional with truth conditions stronger than ‘⊃’ does Import-Export hold.
Here is the proof. Let → be any conditional connective for which Import-Export
holds. Make two innocuous assumptions:


(a) A → B entails A ⊃ B;
(b) if A entails B, A → B is a logical truth.


Now consider the formula

(A ⊃ B) → (A → B)    (17.1)

By Import-Export, it is equivalent to

((A ⊃ B) & A) → B    (17.2)

The antecedent of (17.2) entails its consequent. So, by (b), (17.2) is a logical truth.
So, by Import-Export, (17.1) is a logical truth. The damage is done, but to drive it
home: by (a), (17.1) entails

(A ⊃ B) ⊃ (A → B)    (17.3)

So (17.3) is a logical truth. A truth-functional conditional is a logical truth just in
case its antecedent entails its consequent. So (A ⊃ B) entails (A → B). So → is no
stronger than ⊃.
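Two of the classical facts used in this proof can be verified by enumeration: that Import-Export holds for ⊃, and that the antecedent of (17.2) entails its consequent. The sketch below checks only those two truth-functional steps; it is not a formalization of Gibbard's proof, and the variable names are illustrative assumptions.

```python
from itertools import product


def hook(a: bool, b: bool) -> bool:
    """Material conditional A ⊃ B."""
    return (not a) or b


BOOLS = [True, False]

# Import-Export for ⊃: (A & B) ⊃ C agrees with A ⊃ (B ⊃ C) on all eight lines.
import_export_for_hook = all(
    hook(a and b, c) == hook(a, hook(b, c))
    for a, b, c in product(BOOLS, repeat=3)
)

# The antecedent of (17.2), (A ⊃ B) & A, entails its consequent B.
antecedent_entails_consequent = all(
    b for a, b in product(BOOLS, repeat=2) if hook(a, b) and a
)

print(import_export_for_hook, antecedent_entails_consequent)
```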

For instance, for Hook,

If it rains or snows, then if it doesn’t rain, it will snow.


is trivially true, equivalent to


If it rains or snows, and doesn’t rain, it will snow.
For Stalnaker, the former may well be false, and

If it rains or snows, then if it doesn’t rain, it won’t snow.
may be true, while also

If it doesn’t rain, then if it rains or snows, it will rain,


may be true (see below).
The score is roughly even. Hook gets some things right, some things wrong.
Similarly for Arrow. Can a perilous course be steered ’twixt Arrow and Hook?


17.2. Conditional Belief and Conditional Probability 


Putting truth conditions aside for a while, consider what it is to believe, or to be
more or less certain, that B if A - that John cooked the dinner if Mary didn’t, that
you will recover if you have the operation, and so forth. How do you make such a
judgment? You suppose (assume, hypothesize) that A, and make a hypothetical
judgment about B, under the supposition that A, in the light of your other beliefs.
As Ramsey (1990a, p. 147) put it:


If two people are arguing ‘If p, will q?’ and are both in doubt as to p, they are adding
p hypothetically to their stock of knowledge, and arguing on that basis about q ... they
are fixing their degrees of belief in q given p.


When one is neither certain that B nor certain that ¬B, there remains a range of
different epistemic attitudes one may have to B: one may be nearly certain that B,
think B more likely than not, etc. Similarly, one may be certain, nearly certain, think
it more likely than not, etc., that B, given the supposition that A. Make the idealizing
assumption that degrees of closeness to certainty can be quantified: 100 percent
certain, 90 percent certain, etc.; and probability theory can be turned to for what
Ramsey called the ‘logic of partial belief.’ There one finds a well-established,
indispensable concept, ‘the conditional probability of B given A.’ It is this notion to
which Ramsey refers by the phrase ‘degrees of belief in q given p’ [see chapter 16].

The earliest statement I know of the basic law concerning conditional probabilities
is in an essay by Thomas Bayes published posthumously in 1763:


The probability that two events will both happen is ... the probability of the first
[multiplied by] the probability of the second on the supposition that the first happens
[my emphasis].


A simple example: a ball is to be picked at random. Of the balls, 70 percent are
red (so the probability that a red ball is picked is 70 percent). Of the red balls,
60 percent have a stripe (so the probability that a striped ball is picked, on the
supposition that a red ball is picked, is 60 percent). The probability that a red
striped ball is picked is 60 percent of 70 percent, i.e., 42 percent.

Ramsey, arguing that ‘degrees of belief’ should conform to probability theory,²
stated the same ‘fundamental law of probable belief’ (1990c [1931], p. 77):


Degree of belief in (p and q) = degree of belief in p × degree of belief in q given p


For example, you are 50 percent certain that the test will be on conditionals, and
80 percent certain that you will pass on the supposition that it is on conditionals. So
you are 40 percent certain that it will be on conditionals and you will pass.
Accepting Ramsey’s suggestion that ‘if’, ‘given that’, ‘on the supposition that’
come to the same thing, writing ‘p(-)’ for ‘degree of belief in (-)’, and ‘p_A(-)’ for
‘degree of belief in (-) given A’, and rearranging the basic law, gives:


p(B if A) = p_A(B) = p(A & B)/p(A), provided p(A) ≠ 0
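The basic law and its rearrangement are simple enough to check with exact arithmetic. The following sketch uses the two worked examples from the text (the balls and the test); the function names `joint` and `conditional` are hypothetical, chosen only for this illustration.

```python
from fractions import Fraction


def joint(p_first: Fraction, p_second_given_first: Fraction) -> Fraction:
    """Basic law: p(A & B) = p(A) x p(B given A)."""
    return p_first * p_second_given_first


def conditional(p_a_and_b: Fraction, p_a: Fraction) -> Fraction:
    """Rearranged law: p(B if A) = p(A & B) / p(A), defined only when p(A) is non-zero."""
    if p_a == 0:
        raise ValueError("p(B if A) is undefined when p(A) = 0")
    return p_a_and_b / p_a


# Balls: 70 percent are red, 60 percent of the red ones are striped.
red_striped = joint(Fraction(70, 100), Fraction(60, 100))

# Test: 50 percent certain it is on conditionals, 80 percent certain of passing if so.
on_conditionals_and_pass = joint(Fraction(50, 100), Fraction(80, 100))

print(float(red_striped), float(on_conditionals_and_pass))
print(conditional(red_striped, Fraction(70, 100)))
```

Exact fractions are used rather than floats so that the percentages in the text come out without rounding error.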


Figure 17.1 shows a partition (a set of mutually exclusive and jointly exhaustive
propositions). One’s degrees of belief in the members of a partition should sum to
100 percent: that is all there is to the requirement that degrees of belief have the
structure of probabilities.


Figure 17.1 [a partition of the possibilities: A & B, A & ¬B, ¬A]


Suppose a person X thinks it 50 percent likely that ¬A (hence 50 percent likely
that A), 40 percent likely that A & B, and 10 percent likely that A & ¬B. (Note that
as {A, ¬A} and {A & B, A & ¬B, ¬A} are both partitions, it follows that


p(A) = p(A & B) + p(A & ¬B).)


How does X evaluate ‘If A, B’? X assumes that A, that is, hypothetically eliminates
¬A. In the part of the partition that remains, in which A is true, B is four times as
likely as ¬B; that is, under the assumption that A, it is four to one that B: p(B if A)
is 80 percent, p(¬B if A) is 20 percent. Equivalently, as A & B is four times as likely
as A & ¬B, p(B if A) = 4/5, or 80 percent. Equivalently, p(A & B) is 4/5 of p(A). In
non-numerical terms: you believe that if A, B to the extent that you think A & B is
nearly as likely as A; or, to the extent that you think A & B is much more likely than
A & ¬B. If you think A & B is as likely as A, you are certain that if A, B. In this
case, your p(A & ¬B) = 0.
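X's evaluation can be reproduced directly from the three-cell partition. This is a minimal sketch under the figures given for X; the dictionary keys and the function name `p_b_if_a` are illustrative assumptions.

```python
from fractions import Fraction

# X's degrees of belief over the partition {A & B, A & ¬B, ¬A}.
belief = {
    "A&B": Fraction(40, 100),
    "A&~B": Fraction(10, 100),
    "~A": Fraction(50, 100),
}


def p_b_if_a(cells) -> Fraction:
    """Eliminate ¬A hypothetically, then take A & B as a proportion of what remains."""
    p_a = cells["A&B"] + cells["A&~B"]
    if p_a == 0:
        raise ValueError("the thought-experiment needs p(A) to be non-zero")
    return cells["A&B"] / p_a


# Degrees of belief in the members of a partition sum to 100 percent.
assert sum(belief.values()) == 1

print(p_b_if_a(belief))  # 4/5
```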

Note: this thought-experiment can only be performed when p(A) ≠ 0. On this
approach, indicative conditionals only have a role when the thinker takes A to be an
epistemic possibility. If you take yourself to know for sure that Ann is in Paris, you
don’t go in for ‘If Ann is not in Paris ...’ thoughts (though of course you can think
‘If Ann had not been in Paris ...’). In conversation, you may pretend to take
something as an epistemic possibility, temporarily, to comply with the epistemic
state of your hearer. When playing the skeptic, there are not many limits on what
you can, at a pinch, take as an epistemic possibility - as not already ruled out. But
there are some limits, as Descartes found. Is there a conditional thought that begins
‘If I don’t exist now ...’?

On Hook’s account, to be close to certain that if A, B is to give a high value to
p(A ⊃ B). How does p(A ⊃ B) compare with p_A(B)? In two special cases, they are
equal: first, when p(A & ¬B) = 0 (and p(A) ≠ 0), p(A ⊃ B) = p_A(B) = 1 (i.e., 100
percent). Second, if p(A) = 100 percent, p(A ⊃ B) = p_A(B) = p(B). In all other cases,
p(A ⊃ B) > p_A(B). To prove the inequality, consider a picture of a partition, {A & B,
A & ¬B, ¬A}, as in figure 17.1, drawn to scale. Focus on p(A & ¬B). Provided
p(A & ¬B) ≠ 0 and p(¬A) ≠ 0, p(A & ¬B) must be a smaller proportion of the
whole space than it is of the part of the space in which A is true. So, except in these
two cases, p(A & ¬B) < p_A(¬B). It follows that, except in these special cases,


p(A ⊃ B) > p_A(B)


[p(A & ¬B) = 1 - p(A ⊃ B); p_A(¬B) = 1 - p_A(B)]


p(A ⊃ B) and p_A(B) come spectacularly apart when p(¬A) is high and p(A & B) is
much smaller than p(A & ¬B). Let


p(¬A) = 90 percent
p(A & B) = 1 percent
p(A & ¬B) = 9 percent
p_A(B) = 10 percent
p(A ⊃ B) = 91 percent


For instance, I am 90 percent certain that Tom won’t be offered the job, and think
it only 10 percent likely that he will decline the offer if it is made.


p(offer ⊃ decline) = p(not offer, or (offer and decline)) = 91 percent


(It’s sometimes useful, as a heuristic device, to imagine a partition as carved into
a large finite number of equally-probable chunks, such that the propositions with
which we are concerned are true in an exact number of them. The probability of any
non-conditional proposition is the proportion of chunks in which it is true. The
probability of B on the supposition that A is the proportion of the A-chunks (those
in which A is true) which are B-chunks. With some misgivings, I call these chunks
‘worlds’: they are equally-probable, mutually incompatible and jointly exhaustive
epistemic possibilities - enough of them for the propositions with which one is
concerned to be true, or false, in each world. The heuristic value is that judgments
of probability and conditional probability can then be stated as judgments about
proportions.)
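The ‘worlds’ heuristic lends itself to a direct count. The sketch below applies it to the job example, with A as ‘Tom is offered the job’ and B as ‘Tom declines’. The list of 100 worlds is an assumed model matching the stated percentages, and the value given to B in the ¬A-worlds is an arbitrary choice that affects neither result.

```python
from fractions import Fraction

# 100 equally probable worlds, each a pair (A, B) of truth values.
# 90 worlds where no offer is made, 1 where he is offered and declines,
# 9 where he is offered and accepts.
worlds = [(False, False)] * 90 + [(True, True)] * 1 + [(True, False)] * 9


def prob(pred, ws) -> Fraction:
    """Proportion of worlds in which the proposition holds."""
    return Fraction(sum(1 for w in ws if pred(*w)), len(ws))


def prob_given(pred, cond, ws) -> Fraction:
    """Proportion of the cond-worlds that are also pred-worlds."""
    restricted = [w for w in ws if cond(*w)]
    if not restricted:
        raise ValueError("no worlds satisfy the supposition")
    return prob(pred, restricted)


p_hook = prob(lambda a, b: (not a) or b, worlds)             # p(A ⊃ B)
p_cond = prob_given(lambda a, b: b, lambda a, b: a, worlds)  # p_A(B)

print(p_hook, p_cond)  # 91/100 1/10
```

The count over all worlds and the count restricted to A-worlds give the two figures that the text says come spectacularly apart.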

We can now compare Hook, Arrow, and our probability theorist whom we shall 
call Prob, with respect to two questions raised in section 17.1. 








Question 1: You are certain that ¬(A & ¬B), but not certain that ¬A. Should you be
certain that if A, B? (Equivalently: if you are certain that ¬(A & B), but not
certain that ¬A, should you be certain that if A, ¬B? And: if you are certain
that A ∨ B, but not certain that A, should you be certain that if ¬A, B?)

Hook: Yes. Because (A ⊃ B) is true whenever A & ¬B is false.

Prob: Yes. Here, A & B is just as likely as A. B is true in all my A-worlds. p_A(B) = 1.

Arrow: No, not necessarily. For A → B may be false when A & ¬B is false.





Question 2: If you think it likely that ¬A, might you still think it unlikely that if A, B?

Hook: No. (A ⊃ B) is true in all the possible situations in which ¬A is true. If I
think it likely that ¬A, I think it likely that a sufficient condition for the truth
of (A ⊃ B) obtains. I must, therefore, think it likely that if A, B.

Prob: Yes. We had an example above. That most of my probability goes to ¬A
leaves open the question whether or not A & B is more probable than A & ¬B.
If p(A & ¬B) is greater than p(A & B), I think it unlikely that if A, B. That’s
compatible with thinking it likely that ¬A.

Arrow: Yes. (A → B) may be false when A is false. And I might well think it likely
that that possibility obtains.








Prob has squared the circle: he gets the right answer to both questions. In this,
he differs from both Hook and Arrow. Prob’s way of assessing conditionals is
incompatible with the truth-functional way (they answer question 2 differently);
and incompatible with stronger-than-truth-functional truth conditions (they answer
question 1 differently). It follows that Prob’s way of assessing conditionals is
incompatible with the claim that conditionals have truth conditions at all. p_A(B) does not
measure the probability of the truth of any proposition. Suppose it did measure the
probability of the truth of some proposition A * B. Either A * B is entailed by
A ⊃ B or it is not. If it is, it is true whenever A is false, and hence cannot be
improbable when ¬A is probable. That is, it cannot agree with Prob in its answer to
question 2. If A * B is not entailed by A ⊃ B, it may be false when ¬(A & ¬B) is
true, and hence certainty that ¬(A & ¬B) (in the absence of certainty that ¬A) is
insufficient for certainty that A * B: it cannot agree with Prob in its answer to
question 1. This remarkable result was first proved, in a different way, by Lewis
(1986c [1976]).

Although Prob and Hook give the same answer to question 1, their reasons are
different. Prob answers ‘yes,’ not because a proposition, A * B, is true whenever
A & ¬B is false; but because B is true in all the worlds which matter for the
assessment of ‘If A, B’: the A-worlds. Although Prob and Arrow give the same
answer to question 2, their reasons are different. Prob answers ‘yes,’ not because a
proposition, A * B, may be false when A is false, but because the fact that most
worlds are ¬A-worlds is irrelevant to whether most of the A-worlds are B-worlds. To
judge that B is true on the supposition that A is true, it turns out, is not to judge that
something-or-other, A * B, is true, simpliciter.
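A numerical illustration of question 2 may help. The degrees of belief below are hypothetical, and `belief` and `p` are names introduced only for this sketch. Most of the probability goes to ¬A, so the truth-functional conditional comes out probable, while the proportion of A-cases that are B-cases stays low.

```python
from fractions import Fraction

# Hypothetical degrees of belief over the four combinations of truth values
# for (A, B); the numbers are illustrative only.
belief = {
    (True, True): Fraction(1, 100),
    (True, False): Fraction(9, 100),
    (False, True): Fraction(45, 100),
    (False, False): Fraction(45, 100),
}

def p(event):
    """Probability of the cases picked out by the predicate `event`."""
    return sum(weight for case, weight in belief.items() if event(*case))

p_not_a = p(lambda a, b: not a)                 # 9/10: ¬A is likely
p_hook = p(lambda a, b: (not a) or b)           # 91/100: Hook's conditional is likely
p_b_given_a = p(lambda a, b: a and b) / p(lambda a, b: a)  # 1/10: Prob's value is low

print(p_not_a, p_hook, p_b_given_a)
```

On these figures Hook must think it likely that if A, B, whereas Prob thinks it unlikely, which is the disagreement described above.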





17.2.1. Validity


Adams (1965, 1966, 1975) gave a theory of the validity of arguments involving
conditionals as construed by Prob. He explained something important about classically
valid arguments as well: that they are, in a special sense to be made precise,
probability-preserving. This property can be generalized to apply to arguments with
conditionals. The valid ones are those which, in the special sense, preserve probability
(or conditional probability). (See also chapter 16.)

First consider classically valid (that is, necessarily truth-preserving) arguments which
do not involve conditionals. These are used in arguing from contingent premisses
about which one is often less than completely certain. The question arises: how certain
can one be of the conclusion of the argument, given that one thinks, but is not
sure, that the premisses are true? Call the improbability of a statement one minus its
probability. Adams showed this: if (and only if) an argument is valid, then in no
probability distribution does the improbability of its conclusion exceed the sum of
the improbabilities of its premisses. Call this the Probability Preservation Principle
(PPP).

The proof of PPP rests on the Partition Principle (that the probabilities of the
members of a partition sum to 100 percent) and nothing else, beyond the fact that if A
entails B, p(A & ¬B) = 0. Here are three consequences:


(1) If A entails B, p(A) ≤ p(B)
(2) p(A ∨ B) = p(A) + p(B) − p(A & B) ≤ p(A) + p(B)
(3) For all n, p(A_1 ∨ ... ∨ A_n) ≤ p(A_1) + ... + p(A_n)

Suppose A_1, ..., A_n ⊢ B. Then
    ¬B ⊢ ¬A_1 ∨ ... ∨ ¬A_n
Therefore
    p(¬B) ≤ p(¬A_1) + ... + p(¬A_n)


The improbability of the conclusion of a valid argument cannot exceed the sum of
the improbabilities of the premisses.

The result is useful to know: if you have two premisses of which you are at least
99 percent certain, they entitle you to be at least 98 percent certain of a conclusion
validly drawn from them. Of course, if you have 100 premisses each at least 99
percent certain, your conclusion may have zero probability. That is the lesson of the
‘Lottery Paradox.’ Still, Adams’s result vindicates deductive reasoning from uncertain
premisses, provided that they are not too uncertain, and there are not too many
of them.
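The arithmetic behind these two cases can be sketched as follows. The helper `ppp_lower_bound` is a name assumed for this example; it simply subtracts the summed improbabilities of the premisses from one. The lottery is modelled, as an assumption, by 100 equally likely worlds in each of which exactly one ticket wins.

```python
from fractions import Fraction

def ppp_lower_bound(premiss_probs):
    """Lowest probability PPP allows for the conclusion of a valid argument:
    one minus the summed improbabilities of the premisses, floored at zero."""
    total_improbability = sum(1 - q for q in premiss_probs)
    return max(Fraction(0), 1 - total_improbability)

two = ppp_lower_bound([Fraction(99, 100)] * 2)        # 49/50, i.e. 98 percent
hundred = ppp_lower_bound([Fraction(99, 100)] * 100)  # 0

# Lottery check: world w is the world in which ticket w wins.
tickets = range(100)
worlds = list(tickets)
p_ticket_loses = [Fraction(sum(w != t for w in worlds), len(worlds)) for t in tickets]
p_all_lose = Fraction(sum(all(w != t for t in tickets) for w in worlds), len(worlds))

print(two, hundred, p_ticket_loses[0], p_all_lose)
```

Each premiss ‘ticket t loses’ has probability 99/100, yet their conjunction has probability zero, so the bound of zero for 100 such premisses is actually reached.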

So far, we have a very useful consequence of the classical notion of validity. Now
Adams extends this consequence to arguments involving conditionals. Take a language
with ‘and’, ‘or’, ‘not’ and ‘if’, but with ‘if’ occurring only as the main connective
in a sentence. (We put aside compounds of conditionals.) Take any argument
formulated in this language. Consider any probability function over the sentences of
this argument which assigns non-zero probability to the antecedents of all
conditionals: that is, any assignment of numbers to the non-conditional sentences which
conforms to the Partition Principle, and to the conditional sentences which
conforms to Prob’s thesis:


p(B if A) = p_A(B) = p(A & B)/p(A)


Call the improbability of a conditional ‘If A, B’ one minus p_A(B). Define a valid
argument as one such that there is no probability function in which the improbability
of the conclusion exceeds the sum of the improbabilities of the premisses. And a
plausible logic emerges, with rules of proof, a decision procedure, and provable
consistency and completeness (Adams, 1975, 1998).

Call a non-conditional sentence a factual sentence. If an argument has a factual
conclusion, and is classically valid with ‘if’ interpreted as ‘⊃’, it is probabilistically
valid. It was shown that in all distributions in which p(A) ≠ 0, p(A ⊃ B) ≥ p_A(B). So
if a factual conclusion follows from a premiss interpreted as the weaker A ⊃ B, it
follows from the stronger A ⇒ B (as I shall write Prob’s conditional). But not all
truth-functionally valid arguments with conditional conclusions remain valid on this
interpretation of the conditional and this construal of validity. The premisses may
entail the weaker A ⊃ B without entailing the stronger A ⇒ B.
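That the truth-functional conditional is never less probable than Prob’s conditional can be spot-checked numerically. The sketch below samples random distributions over the four cases (with p(A) non-zero by construction) and also checks the exact size of the gap, p(¬A) times (1 − p_A(B)), which follows from p(A ⊃ B) = p(¬A) + p(A)·p_A(B). The function names and the sampling scheme are assumptions of this example, and a sample is of course no substitute for the proof.

```python
import random
from fractions import Fraction

def random_distribution(rng):
    """Random probabilities for the cases A&B, A&¬B, ¬A&B, ¬A&¬B (all non-zero)."""
    weights = [rng.randint(1, 50) for _ in range(4)]
    total = sum(weights)
    return [Fraction(w, total) for w in weights]

def hook_and_prob(dist):
    """Return p(A), the probability of the material conditional, and p_A(B)."""
    ab, a_notb, _nota_b, _nota_notb = dist
    p_a = ab + a_notb
    p_hook = 1 - a_notb      # A ⊃ B is false only in the A&¬B case
    p_cond = ab / p_a        # Prob's conditional probability
    return p_a, p_hook, p_cond

rng = random.Random(0)
results = [hook_and_prob(random_distribution(rng)) for _ in range(1000)]

never_below = all(p_hook >= p_cond for _, p_hook, p_cond in results)
gap_matches = all(
    p_hook - p_cond == (1 - p_a) * (1 - p_cond) for p_a, p_hook, p_cond in results
)
print(never_below, gap_matches)
```

The gap vanishes only when A is certain or B is certain given A, which is why the two readings come apart exactly where ¬A carries much of the probability.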

Conditional Proof fails. Indeed, all departures from truth-functional validity can
be traced to the failure of Conditional Proof. In the following list, the inference on
the left is valid; its partner on the right, derivable by a step of Conditional Proof, is
not.
      Valid                           Invalid
(1)   A; B ⊢ B                        B ⊢ A ⇒ B
(2)   A ∨ B; ¬A ⊢ B                   A ∨ B ⊢ ¬A ⇒ B
(3)   ¬(A & B); A ⊢ ¬B                ¬(A & B) ⊢ A ⇒ ¬B
(4)   A ⇒ C; A & B ⊢ C                A ⇒ C ⊢ A & B ⇒ C
(5)   A ⇒ B; B ⇒ C; A ⊢ C             A ⇒ B; B ⇒ C ⊢ A ⇒ C
(6)   A ⇒ B; ¬B ⊢ ¬A                  A ⇒ B ⊢ ¬B ⇒ ¬A
(7)   A & B ⇒ C; A; B ⊢ C             A & B ⇒ C; A ⊢ B ⇒ C





At first sight it is puzzling that CP should fail on Prob’s understanding of ‘if’. He
construes ‘if’-clauses as suppositions, that is, assumptions. What is the difference between
the role of a premiss on the left of the turnstile, and the antecedent of a conditional
on the right of the turnstile?

The antecedent of the conditional in the conclusion is indeed treated as an
assumption. The premisses are not being treated as assumptions, but as beliefs; and
moreover, as beliefs which need not be certainties. With arguments involving only
factual statements, the difference does not matter. For these, a valid argument may
be construed as


(a) showing what follows necessarily from assumptions;

(b) an argument which preserves certainty: certainty in the premisses entitles you
to certainty in the conclusion;

(c) an argument which preserves high probability, in line with PPP.


If an assumption is taken as a hypothetical certainty, (a) and (b) stay in line for
arguments with conditionals also. However, with arguments involving conditionals,
(b) can be satisfied while (c) is not. There are arguments which preserve certainty
which do not preserve high probability, for example the ‘invalid’ argument forms on
the right above. Their premisses can be arbitrarily close to 100 percent probable,
their conclusion arbitrarily close to 0 percent probable.

The logico-mathematical fact behind this is the difference in the logical powers of
‘All’ and ‘Almost all.’ Consider (4) on the right: strengthening of the antecedent. If
all A-worlds are C-worlds, then all A & B-worlds are C-worlds. But we can have:
almost all A-worlds are C-worlds, yet no A & B-worlds are C-worlds. I can be
almost certain that if you strike the match, it will light, yet give zero probability to
its lighting if you dip it in water and strike it. Consider (5) on the right: transitivity.
If all A-worlds are B-worlds, and all B-worlds are C-worlds, then all A-worlds are
C-worlds. But we can have: all A-worlds are B-worlds, almost all B-worlds are
C-worlds, yet no A-world is a C-world; just as we can have: all kiwis are birds, almost
all birds fly, yet no kiwis fly. Take an example from Adams (1966):





If Jones is elected, Brown will resign (highly likely: almost all J-worlds are
B-worlds).
If Brown dies before the election, Jones will be elected (all D-worlds are
J-worlds).


but not
If Brown dies before the election, Brown will resign.
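Both counterexamples can be modelled with small sets of equally probable worlds. The particular counts below (99 dry strikes to 1 wet one, and so on) are invented for the illustration, as are the helper `cond_prob` and the dictionary keys.

```python
from fractions import Fraction

def cond_prob(worlds, consequent, antecedent):
    """Proportion of antecedent-worlds that are consequent-worlds."""
    selected = [w for w in worlds if antecedent(w)]
    return Fraction(sum(1 for w in selected if consequent(w)), len(selected))

# Strengthening: 99 worlds in which a dry match is struck and lights, and
# 1 world in which it is dipped in water, struck, and does not light.
match_worlds = (
    [{"struck": True, "wet": False, "lights": True}] * 99
    + [{"struck": True, "wet": True, "lights": False}]
)
p_light_if_struck = cond_prob(
    match_worlds, lambda w: w["lights"], lambda w: w["struck"])
p_light_if_wet_and_struck = cond_prob(
    match_worlds, lambda w: w["lights"], lambda w: w["struck"] and w["wet"])

# Transitivity: the Jones and Brown example.
election_worlds = (
    [{"dies": True, "jones": True, "resigns": False}]
    + [{"dies": False, "jones": True, "resigns": True}] * 98
    + [{"dies": False, "jones": False, "resigns": False}]
)
p_resign_if_jones = cond_prob(
    election_worlds, lambda w: w["resigns"], lambda w: w["jones"])
p_jones_if_dies = cond_prob(
    election_worlds, lambda w: w["jones"], lambda w: w["dies"])
p_resign_if_dies = cond_prob(
    election_worlds, lambda w: w["resigns"], lambda w: w["dies"])

print(p_light_if_struck, p_light_if_wet_and_struck)           # 99/100 0
print(p_resign_if_jones, p_jones_if_dies, p_resign_if_dies)   # 98/99 1 0
```

In both models the premisses are certain or nearly so while the conclusion has probability zero, so the improbability of the conclusion far exceeds the summed improbabilities of the premisses.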


Someone might react as follows: ‘All I want of a valid argument is that it preserve
certainty. I’m not bothered if an argument can have premisses close to certain and a
conclusion far from certain, as long as the conclusion is certain when the premisses
are certain.’ Let us not argue about the word ‘valid.’ Use it in such a way that an
argument is valid provided it preserves certainty, if you wish. If your interest in logic
is confined to its application to mathematics or other a priori matters, that is fine.
Further, when your arguments do not contain conditionals, if you have
certainty-preservation, probability-preservation comes free. But if you use conditionals when
arguing about contingent matters, then great caution will be required. Unless you
are 100 percent certain of the premisses, the arguments on the right guarantee
nothing about what you are entitled to think about the conclusion. The line between
100 percent certainty and something very close is hard to make out: it is not clear
how you tell which side of it you are on. The epistemically cautious might admit
that they are never, or only very rarely, 100 percent certain of contingent conditionals.
So it would be useful to have another category of argument, the ‘super-valid’, which
preserves high probability as well as certainty. The arguments on the left are
super-valid, those on the right are not.


17.3. Further Issues 


17.3.1. Belief-relative propositions


Adams’s theory of validity emerged in the mid-1960s. ‘Nearest possible worlds’
theories were not yet in evidence. Nor was Lewis’s result that conditional probabilities
are not probabilities of the truth of a proposition. (Adams expressed scepticism
about truth conditions for conditionals, but the question was still open.) Stalnaker’s
(1991b [1968]) (also 1981 [1970]) semantics for conditionals was an attempt to
provide truth conditions which were compatible with Ramsey’s and Adams’s thesis
about conditional belief. That is, he sought truth conditions for a proposition
A > B (his notation) such that p(A > B) must equal p_A(B) (Stalnaker, 1991b [1968],
pp. 33-4):


Now that we have found an answer to the question, ‘How do we decide whether or not
we believe a conditional statement?’ [Ramsey’s and Adams’s answer] the problem is
to make the transition from belief conditions to truth conditions; ... The concept of a
possible world is just what we need to make the transition, since a possible world is the
ontological analogue of a stock of hypothetical beliefs. The following ... is a first
approximation to the account I shall propose: Consider a possible world in which A is
true and otherwise differs minimally from the actual world. ‘If A, then B’ is true (false)
just in case B is true (false) in that possible world.


If an argument is necessarily truth-preserving, the improbability of its conclusion
cannot exceed the sum of the improbabilities of the premisses. This was the criterion
Adams used in constructing his logic. So Stalnaker’s logic for conditionals must
agree with Adams’s over their common domain. And it does. The argument forms
on the right above are invalid on Stalnaker’s semantics. Consider (4). The following
is possible: in the nearest world in which you strike the match, it lights; in the
nearest world in which you dip the match in water and strike it, it does not light. So
Strengthening fails.

Conditional Proof fails for Stalnaker’s semantics.


A ∨ B; ¬A ⊢ B
is, of course, valid. But
(*) A ∨ B ⊢ ¬A > B


is not: it can be true that Ann or Mary cooked the dinner (for Ann cooked it); yet
false that in the nearest world to the actual world in which Ann did not cook it,
Mary cooked it.
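The counterexample can be sketched with a toy nearest-world model. The three worlds and their ordering by nearness are stipulated for the example, and `nearest` and `conditional` are hypothetical helper names; this is a simplified rendering of a Stalnaker-style selection function, not a full implementation of his semantics.

```python
# Worlds listed from nearest to farthest from the actual world (index 0 is
# the actual world). The ordering is a stipulation made for this example.
WORLDS = [
    {"ann": True, "mary": False},    # actual world: Ann cooked the dinner
    {"ann": False, "mary": False},   # nearest world in which Ann did not cook
    {"ann": False, "mary": True},
]

def nearest(antecedent, worlds=WORLDS):
    """Selection function: the nearest world at which the antecedent holds.
    Raises StopIteration if no listed world satisfies the antecedent."""
    return next(w for w in worlds if antecedent(w))

def conditional(antecedent, consequent, worlds=WORLDS):
    """'A > B' is true at the actual world iff B holds at the nearest A-world."""
    return consequent(nearest(antecedent, worlds))

actual = WORLDS[0]
disjunction = actual["ann"] or actual["mary"]
if_not_ann_then_mary = conditional(lambda w: not w["ann"], lambda w: w["mary"])

print(disjunction)            # True
print(if_not_ann_then_mary)   # False
```

The premiss of (*) is true and its conclusion false at the actual world, so the inference is not truth-preserving in this model.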

Stalnaker (1991a [1975]) tried to show that although (*) is invalid, it is nevertheless
a ‘reasonable inference’ when ‘A ∨ B’ is assertable, that is, when the speaker has
ruled out ¬A & ¬B, but ¬A & B and A & ¬B remain open possibilities. Indicative
conditionals, he claims, are used only when their antecedents are epistemically
possible for the speaker (here he agrees with Prob). Then comes the crucial claim:
worlds which are epistemically possible for the speaker count as closer to the actual world
than those which are not. All ¬A & ¬B-worlds have been eliminated. Not all ¬A &
B-worlds have been eliminated. All the speaker’s epistemically possible ¬A-worlds
are B-worlds. So the closest ¬A-world is a B-world. ¬A > B is true.

This makes the truth conditions of a conditional, e.g.,


If Ann didn’t cook the dinner, Bob cooked it. 


dependent on what the speaker believes. All that is common to different utterances of
‘A > B’ is that they say that a certain A-world is a B-world. That is not news:
provided that A and B are compatible, some A-world is a B-world. Which world is
being said to be a B-world depends on the speaker’s beliefs. With fixed meanings for
A and B, there is no single proposition A > B, but a different one for each belief
state: one might write A >_p B, where ‘p’ is itself indexed to a person and a time.
Earlier I argued as follows against non-truth-functionalist truth conditions: there
are six incompatible logically possible combinations of truth values for A, B and
‘¬A → B’. You start off with no firm beliefs about which obtains. Now you eliminate
¬A & ¬B, i.e., establish A ∨ B. That leaves five remaining possibilities, including
two in which ‘¬A → B’ is false. So you cannot be certain that ¬A → B. Stalnaker
replies: you cannot, indeed, be certain that the proposition you were wondering
about earlier is true. But in your new epistemic state, you express a new proposition
by ‘¬A > B’, with different truth conditions, governed by a new nearness relation,
and you know that that new proposition is true.

Disagreement and change of mind give way to equivocation. Suppose you and I
start off knowing A ∨ B ∨ C. You then eliminate C. You accept ‘If ¬A, B’ and reject
‘If ¬A, C’. I eliminate B. I accept ‘If ¬A, C’, and reject ‘If ¬A, B’. I assent to a
sentence from which you dissent, and vice versa. We do not disagree. We express
different propositions, with different truth conditions, governed by our different
epistemic states. Worlds which are near for me are far for you.

Are belief-relative truth conditions better than no truth conditions? They account
for the validity of arguments; but Adams’s logic has its own rationale without them.
They account for sentences with conditional constituents. But we saw, and will see
below, they sometimes give counterintuitive results. Do they escape Lewis’s negative
result? Although Lewis showed that there is no proposition A * B such that
p(A * B) = p_A(B) in every belief state, he did not rule out that in every belief state
there is some proposition or other, A * B, such that p(A * B) = p_A(B). Nevertheless,
in the wake of Lewis, Stalnaker himself proved this stronger result, for his
conditional connective: the equation p(A > B) = p_A(B) cannot hold for all propositions
in a single belief state. If it holds for A and B, one can find two other propositions,
C and D (truth-functional compounds of A and B) for which, demonstrably, it does
not hold. Gibbard (1981, pp. 231-4) showed just how belief-sensitive Stalnaker’s
truth conditions would be, and later, Stalnaker (1984) abandoned the claim that
conditionals express belief-relative propositions, writing ‘It follows that the conditional
expresses one proposition when it is asserted, and a different one when it is
denied’ (1984, p. 110).


17.3.2. Assertability


Jackson holds that ‘If A, B’ has the truth conditions of ‘A ⊃ B’, i.e., ‘¬A ∨ B’; but
it is part of its meaning that it is governed by a special rule of assertability. ‘If’ is
assimilated to words like ‘but’, ‘nevertheless’ and ‘even’. ‘A but B’ has the same
truth conditions as ‘A and B’, yet they differ in meaning: ‘but’ is used to signal a
contrast between A and B. When A and B are true but the contrast is lacking, ‘A
but B’ is true but inappropriate. Likewise,


Even John can understand this proof. 


is true when John can understand this proof, but inappropriate when John is a
world-class logician.

In asserting ‘If A, B’ the speaker expresses his belief that A ⊃ B, and also indicates
that this belief is ‘robust’ with respect to A. In his early work Jackson (1979,
1981) explained ‘robustness’ thus: the speaker would not abandon his belief that
A ⊃ B if he were to learn that A. This, it was claimed, amounted to the speaker’s
having a high probability for A ⊃ B given A, i.e., for ¬A ∨ B given A, which is just
to have a high probability for B given A. Thus, assertability goes by conditional
probability. Robustness was meant to ensure that an assertable conditional is fit for
modus ponens. Robustness is not satisfied if you believe A ⊃ B solely on the grounds
that ¬A. Then, if you discover that A, you will abandon your belief in A ⊃ B rather
than conclude that B.

Jackson came to realize, however, that there are assertable conditionals which one
would not continue to believe if one learned the antecedent. I say


If Reagan worked for the KGB, I'll never find out. 


(Lewis’s example, 1986b, p. 155). My conditional probability for consequent given
antecedent is high. But if I were to discover that the antecedent is true, I would
abandon the conditional belief, rather than conclude that I will never find out that
the antecedent is true. So, in his later work, Jackson (1987) defined robustness with
respect to A simply as p_A(A ⊃ B) being high, which is trivially equivalent to p_A(B)
being high. In most cases, though, the earlier explanation will hold good.
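The equivalence, and the way robustness can fail, can be checked on a small example. The degrees of belief are hypothetical: they describe someone whose confidence in the material conditional rests almost entirely on doubting A. The helper names `conditional_on`, `p` and `hook` are assumptions of this sketch.

```python
from fractions import Fraction

def p(dist, event):
    """Probability of the (A, B) cases picked out by the predicate `event`."""
    return sum(weight for case, weight in dist.items() if event(*case))

def conditional_on(dist, given):
    """Conditionalize a distribution over (A, B) cases on the predicate `given`."""
    total = p(dist, given)
    return {case: (weight / total if given(*case) else Fraction(0))
            for case, weight in dist.items()}

def hook(a, b):
    """The material conditional A ⊃ B, i.e. ¬A ∨ B."""
    return (not a) or b

# Hypothetical believer: confident of A ⊃ B mainly because ¬A is likely.
dist = {
    (True, True): Fraction(1, 100),
    (True, False): Fraction(4, 100),
    (False, True): Fraction(45, 100),
    (False, False): Fraction(50, 100),
}

before = p(dist, hook)                              # 24/25
after_learning_a = conditional_on(dist, lambda a, b: a)
robust_value = p(after_learning_a, hook)            # 1/5
b_given_a = p(after_learning_a, lambda a, b: b)     # 1/5

print(before, robust_value, b_given_a)
```

The probability of A ⊃ B given A coincides with the probability of B given A, and here both are low even though A ⊃ B is unconditionally probable, so the belief is not robust with respect to A.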

What are the truth-functional truth conditions needed for? Do they explain the
meaning of compounds of conditionals? According to Jackson (1987, p. 129), they
do not. We know what ‘A ⊃ B’ means, as a constituent in complex sentences. But
‘A ⊃ B’ does not mean the same as ‘If A, B’. The latter has a special assertability
condition. And his theory has no implications about what, if anything, ‘If A, B’
means when it occurs, unasserted, as a constituent in a longer sentence. Here his
analogy with ‘but’ etc. fails. ‘But’ can occur in unasserted clauses:





Either he arrived on time but didn’t wait for us, or he never arrived at all
(Woods, 1997, p. 61). It also occurs in questions and commands:


Shut the door but leave the window open.
Does anyone want eggs but no ham?
‘But’ means ‘in contrast.’ Its meaning is not given by an ‘assertability condition.’
Do the truth-functional truth conditions explain the validity of arguments involving
conditionals? Not in a way that accords well with intuition, as has been seen, though
Jackson (1987, pp. 50-1) claims that our intuitions are at fault here: we confuse
preservation of truth and preservation of assertability. Nor is there any direct evidence
for Jackson’s theory. Nobody who thinks the Republicans won’t win treats


If the Republicans win, they will double income tax.
as inappropriate but probably true, in the same category as 


Even Gédel understood truth-functional logic. 
Jackson is aware of this. He seems to advocate an error theory of conditionals:
ordinary linguistic behavior fits the false theory that there is a proposition A * B
such that p(A * B) = p_A(B) (Jackson, 1987, pp. 39-40). If this is his view, he cannot
hold that his own theory is a psychologically accurate account of what people do
when they use conditionals. Perhaps it is an account of how we should use
conditionals, and would if we were free from error: we should accept that


If the Republicans win they will double income tax. 


is probably true when it is probable that the Republicans will not win. Would we
gain anything from following this prescription? On the contrary, we would deprive
ourselves of the ability to discriminate between believable and unbelievable
conditionals whose antecedents we think false.


17.3.3. Compounds

Lewis (1986c [1976], p. 134) wrote, 


Adams has convinced me. I shall take it as established that the assertability of an
ordinary indicative conditional A → C does indeed go by the conditional subjective
probability [p_A(C)].


Should we then deny that conditionals express propositions, with truth conditions?
‘I have no conclusive objection to [this] hypothesis,’ he writes. But he states an
‘inconclusive objection’: that it ‘requires too much of a fresh start’ (p. 142):


What about compound sentences which have such conditionals as constituents? We
think we know how the truth conditions for compound sentences of various kinds are
determined by the truth conditions of constituent subsentences, but this knowledge
would be useless if any of those subsentences lacked truth conditions. Either we need
new semantic rules for many familiar connectives and operators when applied to
indicative conditionals ... or else we need to explain away all seeming examples of compound
sentences with conditional constituents.


Lewis argued that the truth conditions are truth-functional, but the assertability of a
conditional, for Gricean reasons, goes by its conditional probability. In a postscript
to a reprinting of this paper, Lewis (1986b, pp. 152-6) abandons Grice’s in favor of
Jackson’s explanation of why assertability goes by conditional probability.

An account which requires a ‘fresh start,’ however, is preferable to one which
already has unacceptable consequences for compounds of conditionals. Grice focuses
on what is needed to justify the assertion of a conditional, beyond the belief that it
is true. This is no help when it occurs, unasserted, as a constituent of a longer
sentence. And as was seen above with negations of conditionals and conditionals in
antecedents, the problem is reversed: sentences are asserted which would be believed
false if the conditionals had been construed truth-functionally. Jackson (1987,
p. 129) explicitly denies that his theory implies that compounds of conditionals are
meaningful.

Followers of Adams claim that when a sentence with a conditional subsentence is
intelligible, it can be paraphrased by a sentence without a conditional subsentence.
For some constructions this can be done in a general, uniform way. For others, it
can be done only in the presence of contextual clues as to what is meant. They point
out that some constructions are rarer, and harder to understand, than is to be
expected if conditionals have truth conditions. Why do we never hear sentences of
the form ‘Either, if A, B, or, if C, D (I don’t know which)’? Some are impossible to
understand. Gibbard’s example, said of a conference: ‘If Kripke was there if Strawson
was, then Anscombe was there.’ ‘Do you know what you have been told?’ he asks
(1981, p. 235).

‘If A, then if B, C’ is to be paraphrased as ‘If A & B, then C.’ For to suppose that
A, then to suppose that B and make a judgment about C under those suppositions,
is the same as to make a judgment about C under the supposition that A & B.
Consider this as applied to a problem raised by McGee (1985) with the following
example. Before Reagan’s first election, Reagan was hot favorite, a second
Republican, Anderson, was a complete outsider, and Carter was lagging well behind Reagan.
Consider first


(1) If a Republican wins and Reagan does not win, then Anderson will win.
As these are the only two Republicans in the race, (1) is unassailable. Now consider
(2) If a Republican wins, then if Reagan does not win, Anderson will win.
We read (2) as equivalent to (1), hence also unassailable.

Suppose I am close to certain (say, 90 percent certain) that Reagan will win, and 
hence close to certain that 

(3) A Republican will win 

But I don’t believe 

(4) If Reagan does not win, Anderson will win. 

I am less than 1 percent certain that (4). On the contrary, I believe that if Reagan
doesn’t win, Carter will win. As these opinions seem sensible, we have a prima facie
counterexample to modus ponens. I accept (2) and (3), but reject (4). Truth
conditions or not, valid arguments obey the probability preservation principle. I am
100 percent certain that (2), 90 percent certain that (3), but less than 1 percent
certain that (4).

Hook saves modus ponens by claiming that I must accept (4). For Hook, (4) is
equivalent to


Either Reagan will win or Anderson will win.
As I’m 90 percent certain that Reagan will win, I must accept this disjunction, and
hence accept (4). Hook’s reading of (4) is, of course, implausible.
Arrow saves modus ponens by claiming that, although (1) is certain, (2) is not
equivalent to (1), and (2) is almost certainly false. For Stalnaker,


(5) If a Republican wins, then if Reagan does not win, Carter will win,


is true. To assess (5), we need to consider the nearest world in which a Republican
wins (call it w), and ask whether the conditional consequent is true at w. At w,
almost certainly, it is Reagan who wins. We need now to consider the nearest world
to w in which Reagan does not win. Call it w′. In w′, almost certainly, Carter wins.
Stalnaker’s reading of (2) is implausible; intuitively, we accept (2) as equivalent to
(1), and do not accept (5). (On Stalnaker’s semantics,











If Reagan doesn’t win, then if a Republican wins Reagan will win, 


is also true.) 

Prob saves modus ponens by denying that the argument is really of that form.
‘A ⇒ B; A; so B’ is demonstrably valid when A and B are propositions. For instance,
if p(A) = 90 percent and p_A(B) = 90 percent, the lowest possible value for p(B) is
81 percent. The ‘consequent’ of (2),


If Reagan doesn’t win, Anderson will win, 
is not a proposition. The argument is really of the form 
If A & B, then C; A; so if B then C.


This argument form is invalid (Prob and Stalnaker agree). Take the case where
C = A, and we have


If A & B then A; A; so if B then A.


The first premiss is a tautology and falls out as redundant; and we are left with ‘A;
so if B then A.’ We have already seen that this is invalid: I can think it very likely that
Fred is lecturing right now, without thinking that if he was injured on his way to
work, he is lecturing right now. So Prob is not at a unilateral disadvantage when it
comes to compounds of conditionals.
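The McGee case can be worked through with explicit numbers. The credences below are hypothetical, chosen only to match the rough figures in the text (Reagan about 90 percent, Anderson a complete outsider), and the helper names are assumptions of this sketch. It also checks the 81 percent bound, using the fact that p(B) is at least p(A & B) = p(A)·p_A(B).

```python
from fractions import Fraction

# Hypothetical credences about the winner of the election.
CREDENCE = {
    "Reagan": Fraction(9000, 10000),
    "Carter": Fraction(995, 10000),
    "Anderson": Fraction(5, 10000),
}
REPUBLICANS = {"Reagan", "Anderson"}

def p(event):
    """Probability that the winner satisfies the predicate `event`."""
    return sum(w for name, w in CREDENCE.items() if event(name))

def p_given(consequent, antecedent):
    """Conditional probability of the consequent given the antecedent."""
    return p(lambda n: antecedent(n) and consequent(n)) / p(antecedent)

def republican(n):
    return n in REPUBLICANS

def not_reagan(n):
    return n != "Reagan"

def anderson(n):
    return n == "Anderson"

p1 = p_given(anderson, lambda n: republican(n) and not_reagan(n))  # (1), read as 'If A & B, then C'
p3 = p(republican)                                                  # (3)
p4 = p_given(anderson, not_reagan)                                  # (4)

# 'If A & B, then C; A; so if B then C' fails the probability preservation test.
violates_ppp = (1 - p4) > (1 - p1) + (1 - p3)

# Ordinary modus ponens with propositions: p(B) >= p(A) * p_A(B).
lowest_p_b = Fraction(9, 10) * Fraction(9, 10)

print(p1, p3, p4, violates_ppp, lowest_p_b)   # 1 1801/2000 1/200 True 81/100
```

On these figures the premisses have improbabilities of 0 and about 10 percent, while the conclusion has an improbability of 99.5 percent, so the argument form does not preserve probability.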


17.3.4. Other conditionals

As well as conditional beliefs there are conditional desires, hopes, fears, etc. As well
as conditional statements, there are conditional commands, questions, offers, promises,
bets, etc. A conditional clause, ‘If he phones,’ plays the same role in

If he phones, what shall I say?
If he phones, hang up immediately. 

and 
If he phones, Mary will be pleased. 


Which of our theories extends to these other kinds of conditional? 

According to Prob, one believes that B to the extent that one thinks B more likely
than ¬B; one believes that B if A to the extent that one thinks A & B more
likely than A & ¬B; and there is no proposition X such that one must believe X
more likely than ¬X, just to the extent that one believes A & B more likely than
A & ¬B. Conditional desires appear to be similar: to desire that B is to prefer B to
¬B; to desire that B if A is to prefer A & B to A & ¬B; there is no proposition X
such that one prefers X to ¬X just to the extent that one prefers A & B to A & ¬B.


If Mary comes (M), I want her to meet Jane (J).


I prefer M & J to M & ¬J. I do not necessarily prefer M ⊃ J, i.e., ¬M ∨ J, to
M & ¬J. For I may also want Mary to come, and fear that the likeliest way of
¬M ∨ J being true is ¬M. Nor will my conditional desire be satisfied if in the nearest
possible world in which Mary comes, she meets Jane. If the elusive Jane happened to
be here when Mary had almost arrived, but then received an urgent call to return to
work, which turned out to be a mistake, a wrong number, meant for someone else,
I will not be pleased.

If I believe that B if A, i.e., think A & B much more likely than A & ¬B, then this
puts me in a position to make a conditional commitment to B: to assert that B,
conditionally upon A. If A is true, my conditional assertion has the force of an
assertion of B. If A is false, there is no proposition that I asserted. I did, however,
express my conditional belief: it is not as though I said nothing. I say





If you press that switch, there will be an explosion. 


My hearer takes me to have made a conditional assertion of the consequent, one
which will have the force of an assertion of the consequent if she presses the button.
Provided she takes me to be trustworthy and reliable, she thinks that if she presses
the switch, the consequent is likely to be true. That is, she acquires a reason to think
that if she presses it, there will be an explosion; and hence a reason not to press it.

Conditional commands can, likewise, be construed as having the force of a
command of the consequent, conditional upon the antecedent’s being true. The doctor
says to the nurse in the emergency ward,


If the patient is still alive in the morning, change the dressing.





Considered as a command to make Hook’s conditional true, this is equivalent to
‘Make it the case that either the patient is not alive in the morning, or you change
the dressing.’


The nurse puts a pillow over the patient’s face and kills her. On the truth-functional
interpretation, the nurse can claim that he was carrying out the doctor’s order.
Extending Jackson’s account to conditional commands, the doctor said ‘Make it the
case that either the patient is not alive in the morning, or you change the dressing,’
and indicated that she would still command this if she knew that the patient would
be alive. This does not help. The nurse who kills the patient still carried out an
order. Why should the nurse be concerned with what the doctor would command in
a counterfactual situation? Extending Stalnaker’s account to conditional commands,





If it rains, take your umbrella.
becomes 
In the nearest possible world in which it rains, take your umbrella. 


Suppose I have forgotten your command or alternatively am inclined to disregard it.
However, it does not rain. In the nearest world in which it rains, I do not take
my umbrella. On Stalnaker’s account, I disobeyed you. Similarly for conditional
promises: on this analysis, I could break my promise to go to the doctor if the pain
gets worse, even if the pain gets better. This is wrong: conditional commands and
promises are not requirements on my behavior in other possible worlds.

Among conditional questions one can distinguish those in which the addressee is 
presumed to know whether the antecedent is true, and those in which he is not. In 
the latter case, the addressee is being asked to suppose that the antecedent is true, 
and give his opinion about the consequent:


If it rains, will the match be cancelled?


In the former case (‘If you have been to London, did you like it?’) he is expected
to answer the consequent-question if the antecedent is true. If the antecedent is
false, the question lapses: there is no conditional belief for him to express. ‘Not
applicable’, as the childless might write on a form which asks


If you have children, how many children do you have? 


(Perverse childless undergraduates have been known to write ‘17’ on the grounds
that ‘I have children $\supset$ I have 17 children’ is true.) You are not being asked how
many children you have in the nearest possible world in which you have children.
Nor are you being asked what you would believe about the consequent if you came
to believe that you did have children.
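
The undergraduates' joke rests on nothing more than the truth table of the truth-functional conditional. A minimal sketch of that table follows; the function name `hook` and the variable names are illustrative and not taken from the text.

```python
def hook(antecedent: bool, consequent: bool) -> bool:
    """Truth-functional conditional: false only when the antecedent is true
    and the consequent is false."""
    return (not antecedent) or consequent


# Full truth table, keyed by (antecedent, consequent).
truth_table = {(a, b): hook(a, b) for a in (False, True) for b in (False, True)}

# The childless respondent: 'I have children' is false, and so is
# 'I have 17 children', yet the hook conditional comes out true.
has_children = False
has_17_children = False
childless_claim = hook(has_children, has_17_children)
```

On this reading any answer whatsoever would be "true" for a childless respondent, which is one way of seeing why the question is better regarded as lapsing.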

Probability theory needs the notion of conditional probability, which is not the
probability of the truth of a proposition (nor the probability of the occurrence of an
event). It supplies us with an account of conditional belief, or better, degree of
closeness to conditional certainty. Adams showed how to do logic for conditionals
in these terms. Widening our perspective, this thesis fits a general pattern. Any
propositional attitude can be held simpliciter, or under a supposition. Any speech act
can be performed unconditionally, or conditionally upon something else. The phenomena
are better explained without invoking conditional propositions.


Suggested further reading


First, here are some essays which aim to give an overview of work on conditionals, as well as
presenting the authors’ own conclusions. Mackie (1973, ch. 4) is of this type. Sanford
(1989) includes the history, and pre-history, of the subject, and a thorough and thoughtful
critique of contemporary work, as well as original proposals. Edgington (1995) is a ‘State of
the Art’ survey commissioned by Mind. Woods (1997) is an insightful extended essay of this
type, followed by a commentary by Edgington. Next, Jackson (1991) is a useful collection of
articles, many of them classics. Harper et al. (1981) is also a valuable collection of articles,
many of them concerned with the probabilistic approach and with reactions to Lewis's (1986c
[1976]) proof that a conditional probability is not the probability of the truth of a proposition.
Adams (1975) is the locus classicus for the logic Adams developed based on the probabilistic
approach. This is further explained in Adams's (1998) textbook. Jackson (1987) is a sophisticated
defence of the truth-functional account of the truth conditions of indicative conditionals,
which argues that such conditionals are assertable to the extent that the conditional
probability of consequent given antecedent is high. The classic writings on counterfactual
conditionals are Goodman (1995, ch. 1), and Lewis (1973). Lewis (1986s [1979]) provides
further elucidation of the notion of similarity between possible worlds, in response to critics
of that notion. It is reprinted, with extensive postscripts, in Lewis (1986b). There is a huge
number of articles on this still controversial subject. More complete bibliographies are found
in Sanford (1989), Edgington (1995) and Woods (1997).


Notes 


1 Dudman (1984a,b, 1988) has done most to promote this view. See also Gibbard (1981,
pp. 222-6), Smiley (1984), Bennett (1988), Mellor (1993) and Woods (1997). Bennett
(1995) recanted and returned to the traditional view. Jackson (1998a [1990]) also
argues for the traditional view.

2 Probability theory is an abstract structure which can have more than one interpretation.
Ramsey did not claim that the only interpretation of probability was degree of belief.

3 As we are here concerned only with simple entailments (and we are interpreting probability
as degree of belief), we need only claim that if $A$ is recognized to entail $B$, $p(A \,\&\, \neg B) = 0$,
leaving aside the question what someone should believe about $A \,\&\, \neg B$ if $A$ entails $B$ but
the entailment is so complex as to be beyond his ken.

4 (1) and (2) follow from the following facts:


$p(A) = p(A \,\&\, B) + p(A \,\&\, \neg B)$
$p(B) = p(A \,\&\, B) + p(\neg A \,\&\, B)$
If $A$ entails $B$, $p(A \,\&\, \neg B) = 0$
$p(A \vee B) = p(A \,\&\, B) + p(A \,\&\, \neg B) + p(\neg A \,\&\, B)$


(3) is proved by mathematical induction. We have seen that (3) holds for $n = 2$. Assume
that it holds for $n = m$. Then it holds for $n = m + 1$:

$p((A_1 \vee \ldots \vee A_m) \vee A_{m+1}) \le p(A_1 \vee \ldots \vee A_m) + p(A_{m+1}) \le p(A_1) + \ldots + p(A_m) + p(A_{m+1})$
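
The facts listed in this note can be checked on a small finite probability space. The sketch below assumes that (2) is the bound that the probability of a disjunction is at most the sum of the probabilities of its disjuncts; the four weights are hypothetical values chosen only for illustration.

```python
from fractions import Fraction
from itertools import product

# Four "worlds" for two propositions A and B: (A, B) truth-value pairs,
# in the order (T, T), (T, F), (F, T), (F, F).
worlds = list(product([True, False], repeat=2))

# Hypothetical weights; any non-negative weights summing to 1 would do.
weights = [Fraction(1, 2), Fraction(1, 6), Fraction(1, 4), Fraction(1, 12)]


def prob(event):
    """Probability of an event, given as a predicate on (A, B) pairs."""
    return sum(w for world, w in zip(worlds, weights) if event(*world))


p_a = prob(lambda a, b: a)
p_b = prob(lambda a, b: b)
p_a_and_b = prob(lambda a, b: a and b)
p_a_not_b = prob(lambda a, b: a and not b)
p_not_a_b = prob(lambda a, b: (not a) and b)
p_a_or_b = prob(lambda a, b: a or b)

# The decompositions from the note, and the bound they yield.
decomposition_a = p_a == p_a_and_b + p_a_not_b
decomposition_b = p_b == p_a_and_b + p_not_a_b
decomposition_or = p_a_or_b == p_a_and_b + p_a_not_b + p_not_a_b
bound_holds = p_a_or_b <= p_a + p_b
```

Exact fractions are used so that the equalities are checked without rounding error.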


5 The arguments on the right do preserve high probability in many normal applications.
The counterexamples arise in clearly delimited circumstances. Sometimes we can state a
restricted form which is valid. For example, restricted transitivity: $A \Rightarrow B$, $(A \,\&\, B) \Rightarrow C$
$\therefore$ $A \Rightarrow C$ is valid.

6 Adams does not treat of compounds of conditionals. As Stalnaker's conditionals have truth
conditions, there is no problem in principle about their occurring as parts of longer sentences.

7 A semantics of this kind is justly famous for counterfactual conditionals. Stalnaker applied
it to both indicatives and counterfactuals, while allowing that ‘nearness’ may be construed
differently in the two cases. Lewis (1973) independently developed a similar
semantics for counterfactuals. (Lewis's logic also agrees with Adams's.) The idea has
more intuitive plausibility for counterfactuals than for indicatives. The semantics agrees
with the truth-functional semantics when $A$ is true, and comes into its own when $A$ is
false. This is where the focus is for counterfactual thoughts. ‘If Mary had been there ...’:
these are typically assertions about non-actual situations, and we try to decide what
would be true in them. It is not where the focus is for indicative conditional thoughts:
‘If Mary is there ...’. Of course the antecedent of an indicative conditional may be false.
But it is not part of the process of its evaluation to think: suppose Mary isn’t there, then
is ‘If Mary is there, C’ true? Is the nearest world in which she is there a C world?

8 The proof is in Stalnaker’s letter to van Fraassen published in van Fraassen (1976,
pp. 303-4); also Gibbard (1981, pp. 219-20) and Edgington (1995, pp. 276-8).

9 In his earliest writing on the subject, Adams did state his thesis in terms of ‘assertability’
but he abandoned the term as potentially misleading, and construes his thesis as about
the acceptability, or believability, of a conditional.

10 See Appiah (1985, pp. 205-10), Gibbard (1981, pp. 234-8), Edgington (1995,
pp. 280-4), Woods (1997, pp. 58-68 and pp. 120-4); also Jackson (1987, pp. 127-37),
Dummett (1973, pp. 351-4), Dummett (1992, pp. 171-2), Mackie (1973, p. 73).
There have also been sophisticated attempts at a general theory of compounds, compatible
with Prob’s thesis, involving quasi-propositional entities which take more than two
values. One tradition makes the conditional true if $A \,\&\, B$, false if $A \,\&\, \neg B$, neither if $\neg A$,
and the probability of a conditional the probability that it is true given that it has a truth
value. A valuable survey of this work is found in Milne (1997). Another tradition uses
belief relative ‘propositions’ which are assigned 1 if $A \,\&\, B$, 0 if $A \,\&\, \neg B$, and the
conditional probability of $B$ given $A$ if $\neg A$,
and makes the probability of the conditional its expected value. See van Fraassen (1976),
McGee (1989), Jeffrey (1991), and Stalnaker and Jeffrey (1994). Unfortunately, these
approaches still generate counterintuitive consequences for compounds. For some criticisms,
see Edgington (1991, pp. 200-2), and Lance (199)

11 Dummett (1992, p. 115) misrepresents the notion of a conditional assertion when he
says it is “as if [someone] handed his hearers a sealed envelope marked ‘Open only in the
event that...
Chapter 18
Negation 
Heinrich Wansing 


18.1. Introduction 


This chapter is concerned with logical aspects of negation, i.e. with the role of
negation in valid inferences and hence with the contribution negation makes to the
truth and falsity conditions of declarative expressions. Negation is an important
philosophical and logical concept. Often differences between logical systems can, at
least partially, be described as differences between the notions of negation used in
these logics. In natural deduction, for example, classical logic can be obtained from
intuitionistic logic on the addition of the double negation elimination rule:


$\neg\neg A \,/\, A$


Notwithstanding the importance of negation, the immense literature on negation
abounds with disagreement. According to Gabbay (1988), intuitionistic negation is
a typical instance of negation as inconsistency, while according to Avron (1999),
intuitionistic negation clearly fails to be a genuine negation. In the opinion of
Tennant (1999), basing negation on the notion of disproof leads to negation in
intuitionistic relevant logic, whereas by treating the notion of disproof on a par
with the notion of proof, López-Escobar (1972) obtains the strong, constructive
negation of Nelson (Almukdad and Nelson, 1984; Nelson, 1949; 1959), and independently
investigated by von Kutschera (1969). Moreover, it can be shown that
the latter negation fails to be a negation as inconsistency in the sense of Gabbay.
The question arises: What are at least necessary conditions under which a unary
connective ought to be regarded as a negation operation? The disagreement about
negation is, however, even more fundamental. According to Zwarts (1996), for
instance, linguistic research has clearly revealed that negation in natural languages
occurs in various syntactic categories, including sentence negation. However,
Englebretsen (1981), for example, argues to the effect that whereas one ought to
draw the Aristotelian distinction between predicate and predicate term negation,
there is no such thing as external, sentential negation. Thus there is even disagreement
concerning the syntactic types to which negation belongs.
Various strategies are available to obtain a single, uniform account of the multiplicity
of syntactic types of negation or to reduce in the formal analysis the number
of syntactic types in which negation occurs in natural languages:


Elimination: to explain away certain syntactic types 


Generalization: to give an account such that various syntactic types of negation
emerge as special cases of a general construction


Representation: to represent one type of negation in terms of another type 


In section 18.2 instantiations of the first two strategies will be dealt with. Section
18.3 is then devoted to representing negation by means of unary connectives with
a special emphasis on motivating strong, constructive negation. Moreover, various
approaches toward defining and classifying notions of sentential negation will be
surveyed. Some summarizing general ideas are assembled in section 18.4.


18.2. The Syntactic Categories of Negation 


In the literature on syntactic categories in natural languages, it has sometimes been
suggested to treat the Boolean particles ‘not’, ‘and’, and ‘or’ as variably polymorphic
operations. In the case of ‘not’, this means that for any syntactic type (including
sentences), ‘not’ may be combined with an expression of that type to form a compound
expression of the same type; see, for example, van Benthem (1991, pp. 26f.
and ch. 13). However, there is no general consensus on this variable polymorphism.
The proponents of a neo-Aristotelian term logic, for example, have called into question
the nowadays orthodox view that negation is a unary sentential connective.
Most explicitly, Englebretsen (1981) has tried to explain away sentential negation,
and the present section is, among other things, concerned with a critical examination
of this eliminative view.


18.2.1. The neo-Aristotelian elimination of sentence negation


According to the Aristotelian term logic (as presented, for example, in Englebretsen
(1981), Horn (1989, ch. 1), and Sommers (1982)), every sentence consists of exactly
one subject and exactly one predicate. Both the subject and the predicate are possibly
complex terms. In the sentence


John is pleased. 


the expression ‘John’ is the subject, the expression ‘is pleased’ is the predicate, and
the expression ‘pleased’ is the predicate term. As Sommers (1982, p. 287) explains,
“There is no reason to factor the predicate into a part that is the predicate term and
a part that is the copula” ‘is’. If “the terms are not explicit, the traditional logician
[...] could be regimented as ‘Socrates is a runner’.” Every sentence affirms or denies what
is denoted by its predicate of what is denoted by its subject. While


John is pleased.

affirms pleased of John,

John is not pleased.


denies pleased of John. This form of negation is called predicate negation (or pre- 
dicate denial) and is to be distinguished from predicate term negation. In the case of 
predicate term negation, a predicate term is negated to obtain another predicate 
term. The predicate term negation of ‘pleased’ for instance is ‘not-pleased’, and the 
sentence 


John is not-pleased.


affirms the predicate ‘not-pleased’ of John. If the predicate term of a sentence
is negated, this results in a contrary of that sentence. A pair of contrary sentences
cannot both be true. Whereas a predicate term ‘P’ may have many contraries,
according to the neo-Aristotelian term logician, it has exactly one logical contrary,
namely ‘not-P’ (or ‘non-P’). Among the non-logical contraries of the predicate
term ‘ancient’, for example, are ‘medieval’ and ‘modern’. If the predicate of a
sentence is negated, one obtains a contradictory of that sentence. A pair of contradictory
sentences can neither both be false nor both be true. Whereas the predicate
term negation of a sentence implies the predicate negation of that sentence, the
converse is not true. In this sense, predicate term negation is stronger than predicate
denial.

Actually, the distinction between logical and non-logical contraries is quite subtle.
There are passages in Aristotle's writings suggesting that contrariety is a polar notion
(Horn, 1989, pp. 37ff.) presupposing two extreme points of a scale:


Since things which differ from one another may do so to a greater or a less degree,
there exists also a greatest difference, and this I call ‘contrariety’. (Aristotle, Metaphysics
1055a17-28)


Horn (1989, p. 39) distinguishes between


contrariety simpliciter,
immediate (alias strong or logical) contrariety, and
mediate (alias weak or non-logical) contrariety.





If by the span of a predicate term ‘P’ one means the class of entities that can be
either P or not-P, two predicate terms are contraries simpliciter of each other iff
(if and only if) both have the same span and what they denote cannot both be true
of any element from their span. If two contrary predicate terms ‘P’ and ‘Q’ are such
that for any a from their shared span, the sentences ‘a is P’ and ‘a is Q’ form a pair
of contradictory sentences, the terms are said to be immediate contraries. This is the
case if the scale associated with ‘P’ and ‘Q’ is binary. As Sommers (1982, p. 168)
puts it, “[a] pair of logical contraries exhausts a range of predicability.” When
affirmed of numbers, ‘even’ and ‘odd’ form a pair of immediate contraries. Mediate
contraries are contraries that are not immediate. If two mediate contraries denote
the extremes of a (more than binary) scale, they are said to be polar, and otherwise
they are called simple. While, for instance, ‘black’ and ‘white’ are lexicalized polar
contraries, ‘black’ and ‘red’ are simple contraries. Any natural language predicate
term has at most one polar contrary (with respect to a given scale). Also the
immediate contrary of a given term is unique if it exists. It would also make sense
to classify immediate contraries as polar, if being polar is understood in the more
general sense of forming the extremes of an at least binary scale. Both immediate
and mediate non-simple contraries would then be polar in this more general sense.

The situation is more complicated if categorically mistaken sentences are also
taken into account. Whereas ‘John is well’ and ‘John is ill’ are contradictories
because ‘well’ and ‘ill’ are immediate contraries (at least according to Aristotle and
Horn), ‘2 is well’ and ‘2 is ill’ fail to be contradictories, since neither of these
categorially mistaken sentences is true. Likewise, if the name ‘John’ is non-denoting,
then neither ‘John is well’ nor ‘John is ill’ are true.

The term logical contrary suggests a correlation with some systematic syntactic
device that, at the level of formalization, goes beyond merely pairing representations
of lexical items like ‘even’ and ‘odd.’ Englebretsen (1981), Sommers (1982),
and Horn (1989) use the predicate term negation ‘not-P’ (‘non-P’) to form the
immediate (alias logical) contrary of a predicate term ‘P’. The question arises how
this is related to the morphology and the lexical repertoire of natural languages.
Predicate term forming prefixes like ‘dis-’, ‘un-’, and ‘im-’ do not in general generate
immediate contraries but often map a predicate term to its polar non-immediate
contrary. A person needs, for instance, neither to be pleased with a certain situation
nor to be displeased with that situation. She might just not care. Nor need a person
be either happy or unhappy. According to Englebretsen (1981, p. 50), category
mistakes are the only sources of violations of the Law of Bivalence: “Every sentence
which is sensible, category correct, is true or false.” But as has just been seen,
predicate term negation may also violate the Law of Bivalence in categorically correct
sentences.

Since general polar contrariety is a more general concept than immediate contrariety
insofar as polar contraries are immediate if the associated scale is binary, and
since, moreover, the predicate term forming morphemes ‘dis-’, ‘un-’, and ‘im-’
often yield mediate non-simple contraries, it seems justified to use ‘not-P’ (‘non-P’)
to form the unique polar contrary of a predicate term ‘P’ (in the sense of Horn) and
not its immediate contrary. A further justification for this convention is that, in the
case of categorically correct sentences, there is no need to distinguish between
denying a predicate and affirming its immediate contrary, so that, for categorically
correct declarative sentences, forming predicate term negation in a canonical way
may be seen to yield unique polar contraries.
In an attempt to explain away sentence negation, as a first step, Englebretsen
(1981, p. 36) attacks the view that every negation is sentential:


Modern sentential logicians fail to recognize anything but sentential negation. For
such logicians any negated term must somehow be rendered part of a negated sentence.
Nevertheless, notwithstanding what might look like notational economy, there are
some sentences which cannot be analyzed in such a logic: sentences which are true and
make use of term negation, but which become false or senseless when term negation is
replaced by sentential negation. An example of such a sentence is ‘Numbers are neither
coloured nor noncoloured (uncoloured)’. Note that what is noncoloured is colourless,
transparent, invisible, etc. Helium is noncoloured but not the number 2. The mathematical
logician reparses the first as ‘All things which are numbers are neither coloured
nor noncoloured’; then as ‘Everything is such that if it is a number then it is not
coloured and not noncoloured’, then as ‘Everything is such that if it is a number then
it is not the case that it is coloured and it is not the case that it is not the case that it
is coloured’, and finally as ‘$\forall x(Nx \supset (\neg Cx \wedge \neg\neg Cx))$’. But in the usual first order
predicate calculus this entails that nothing is a number! (notation adjusted)





However, this argument clearly fails to show that there are sentences which cannot
be represented using sentential negation. Instead, it demonstrates that using a single
sentential negation operation to represent both predicate negation and predicate
term negation may have undesirable consequences. Using the unary connective ‘$\neg$’
to represent predicate negation and the unary connective ‘$\sim$’ (strong negation) to
represent predicate term negation, the translation of Englebretsen’s example into
predicate logic is:


$\forall x(Nx \supset (\neg Cx \wedge \neg{\sim}Cx))$


‘Everything is such that if it is a number then it is not the case that it is coloured and
it is not the case that it is non-coloured’.
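
Englebretsen's observation about the single-negation rendering can be checked by brute force over a small domain. The sketch below assumes the classical reading of the formula ‘for all x, if Nx then (not Cx and not not Cx)’ and enumerates every interpretation of N and C over a three-element domain; the function and variable names are illustrative.

```python
from itertools import product


def subsets(domain):
    """Yield every subset of the domain as a Python set."""
    for bits in product([False, True], repeat=len(domain)):
        yield {x for x, keep in zip(domain, bits) if keep}


def single_negation_reading(domain, numbers, coloured):
    """Classical truth value of: for all x, N(x) -> (not C(x) and not not C(x))."""
    return all(
        (x not in numbers) or ((x not in coloured) and not (x not in coloured))
        for x in domain
    )


domain = [0, 1, 2]
models = [(n, c) for n in subsets(domain) for c in subsets(domain)]
satisfying = [(n, c) for n, c in models if single_negation_reading(domain, n, c)]

# In every classical model of the formula the extension of N is empty.
nothing_is_a_number = all(len(n) == 0 for n, _ in satisfying)
```

The check covers only this one finite domain, but the reason generalizes: the consequent is a classical contradiction, so the conditional can hold of an object only if that object is not a number.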

In a second step, Englebretsen intends to show that “what is negated is never a 
sentence” (1981, p. 45): 


[W]hen ‘p’ is ‘S is P’, then ‘It is not the case that p’ is ‘S is not P’. Thus, ‘It is
not the case that 9 is even’ is ‘9 is not even’. But sometimes ‘it is not the case that’ can
be read as ‘it is untrue that’. For example, ‘It is not the case that I have stopped beating
my wife’ does not mean ‘I have not stopped beating my wife’ but rather ‘It is untrue
that I have stopped beating my wife’. In other words ‘it is not the case that’ is
ambiguous. Usually it is the ‘not’ of predicate denial. Sometimes it is the predicate
‘untrue’. The mathematical logician always takes it in the second way since he does not
recognize predicate denial. (1981, p. 47f.) [notation adjusted]








He then continues to explain that “[i]f ‘it is not the case that’ is usually a sign of
denial and sometimes a metalinguistic predicate (viz. ‘untrue’), then it seems we are
well on our way to the position that no negation is sentential” (1981, p. 49). First,
one might object that


It is not the case that I have stopped beating my wife.


is itself ambiguous and may well mean that I have not stopped beating my wife.
Second, if the external ‘it is not the case that’ may be read either as predicate negation
or as a meta-linguistic predicate that can be affirmed or denied of propositions,
one might also draw the conclusion that predicate denial and this meta-linguistic
predicate are to be understood as external, sentential negation.

This is not the place for a general discussion of Aristotelian and neo-Aristotelian
term logic. What the term logicians correctly point out is that a distinction must be
drawn between predicate negation and predicate term negation. A natural idea is to
represent these forms of negation by distinct unary connectives. Obviously, such a
representation abstracts away from the innersentential syntactic realization of predicate
negation and predicate term negation in natural languages. The representing
connectives can be iterated and therefore, for instance, be interpreted as algebraic
operations. To formally take into account the distinction between predicate negation
and predicate term negation, it is therefore interesting to investigate algebraic structures
comprising algebraic counterparts of at least two sentential negations.





18.2.2. Generalization on the basis of 2-negation algebras


Yet there must be a way to link the internal operators on predicates with the external
operator (operators) on propositions, a bridge between the logic of terms and the logic
of propositions. (La Palme Reyes et al., 1994, p. 50)


La Palme Reyes et al. (1994, 1999) develop a category-theoretic model for a pair of
negations that can be applied to both predicates and sentences, “thereby incorporating
in a single context both term logic and propositional logic” (1994, p. 51). This
instance of the generalizing approach is based on the notion of a 2-negation algebra.
A 2-negation algebra is a bounded distributive lattice with two unary operations
($\neg$, negation, and $\sim$, supplement)


$(B, \le, \vee, \wedge, 0, 1, \neg, \sim)$

While negation is required to satisfy

$y \le \neg x$ iff $x \wedge y = 0$   (18.1)

supplement must satisfy

${\sim}x \le y$ iff $x \vee y = 1$   (18.2)

La Palme Reyes et al. then use $\neg$ to represent predicate term negation and use $\sim$ to
represent predicate negation. This choice is surprising, because (18.1) defines the
pseudocomplement of Heyting algebras. [See chapter 11.] The pseudocomplement
is the algebraic counterpart of intuitionistic negation, which is a contradiction forming
operation and not a contrary forming one.


The approach by La Palme Reyes et al. is not presented here in any detail, because
it does not account for a certain important difference in terms of inference patterns
between predicate negation and predicate term negation. Whereas predicate negation
satisfies contraposition as a rule, predicate term negation in general does not. If
the predicate term of a sentence is negated resulting in the polar contrary of that
sentence but not in its contradictory, then contraposition fails. If from

John is pleased (in his work).

it can be derived that

Jack is pleased (in his work).

it may nevertheless be the case that

John is displeased (in his work).

cannot be derived from

Jack is displeased (in his work).

because it may be that Jack is displeased and John is neither pleased nor displeased.
However, it is well known that any pseudocomplement $\neg$ in a bounded lattice
satisfies

$x \le \neg y$ iff $y \le \neg x$   (18.3)

Contraposition for the supplement follows by duality. Therefore, although 2-negation
algebras are interesting, natural and simple algebras, they do not seem to
be the right kind of structures to represent predicate term negation.
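
The behaviour of the two operations can be inspected on a very small example. The sketch below takes the three-element chain (bottom, middle, top) as a toy 2-negation algebra, which is an assumed example and not one discussed in the text. It reads (18.1) as ‘y ≤ ¬x iff x ∧ y = 0’ and (18.2) as ‘∼x ≤ y iff x ∨ y = 1’, and then checks that both operations are order-reversing, which is the contraposition property at issue.

```python
# Three-element chain 0 < 1 < 2, read as bottom < middle < top.
ELEMENTS = [0, 1, 2]
BOTTOM, TOP = 0, 2
meet, join = min, max  # on a chain, meet and join are min and max


def negation(x):
    """Pseudocomplement: the largest y with meet(x, y) == BOTTOM."""
    return max(y for y in ELEMENTS if meet(x, y) == BOTTOM)


def supplement(x):
    """Dual operation: the smallest y with join(x, y) == TOP."""
    return min(y for y in ELEMENTS if join(x, y) == TOP)


pairs = [(x, y) for x in ELEMENTS for y in ELEMENTS]

# The two defining conditions, under the readings stated above.
law_18_1 = all((y <= negation(x)) == (meet(x, y) == BOTTOM) for x, y in pairs)
law_18_2 = all((supplement(x) <= y) == (join(x, y) == TOP) for x, y in pairs)

# Contraposition as a rule: x <= y implies op(y) <= op(x), for both operations.
contraposition_negation = all(
    negation(y) <= negation(x) for x, y in pairs if x <= y
)
contraposition_supplement = all(
    supplement(y) <= supplement(x) for x, y in pairs if x <= y
)

# The two operations differ on the middle element.
middle_values = (negation(1), supplement(1))
```

Since both operations reverse the order even in this simple structure, neither can model a contrary forming negation for which contraposition is meant to fail.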


18.3. Representing Negations as Sentential Operations 


This section shows that predicate term negation can be represented as a sentential
negation that has an independent motivation, namely as strong, constructive negation.
Moreover, this section considers various suggestions for defining and classifying
notions of sentential negation that have been made in the literature.


18.3.1. Negation as falsity


There are several distinct justificatory roads to negation in the sense of definite
falsity; see, for instance, Pearce (1991). One of these roads is provided by Kripke
frames interpreted in terms of information states (or pieces) partially ordered by a
relation of ‘possible expansion of information states.’ The basic idea is that an
information state may not only support the truth of certain atomic formulas but also
support the falsity of certain atomic formulas; see Gurevich (1977), López-Escobar
(1972), Routley (1974), and Thomason (1969). In other words, the idea is to treat
verification and falsification on a par as equally important primitive semantic relations.
These considerations give rise to the notion of a Nelson model. A Nelson
model is a structure $\langle I, \sqsubseteq, v^+, v^- \rangle$, where $\langle I, \sqsubseteq \rangle$ is a partially ordered set and both $v^+$
and $v^-$ are valuation functions assigning to every propositional variable $p$ a subset of
$I$. Intuitively, $v^+$ sends atoms to the information states at which they are verified,
whereas $v^-$ sends atoms to the information states at which they are falsified. Moreover,
it is required that for every propositional variable $p$ and every $t, u \in I$:


Persistence$^+$: if $t \sqsubseteq u$, then $t \in v^+(p)$ implies $u \in v^+(p)$
Persistence$^-$: if $t \sqsubseteq u$, then $t \in v^-(p)$ implies $u \in v^-(p)$


Let $\mathcal{M} = \langle I, \sqsubseteq, v^+, v^- \rangle$ be a Nelson model, $t \in I$ and $A$ a formula in the language with
strong negation $\sim$, intuitionistic implication $\supset$, conjunction $\wedge$, and disjunction $\vee$
over the denumerable set Atom of propositional variables. The notions $\mathcal{M}, t \models^+ A$ ($A$
is verified at $t$ in $\mathcal{M}$) and $\mathcal{M}, t \models^- A$ ($A$ is falsified at $t$ in $\mathcal{M}$) are inductively defined as
follows:


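$\mathcal{M}, t \models^+ p$ iff $t \in v^+(p)$, $p \in$ Atom
$\mathcal{M}, t \models^- p$ iff $t \in v^-(p)$, $p \in$ Atom
$\mathcal{M}, t \models^+ B \wedge C$ iff $\mathcal{M}, t \models^+ B$ and $\mathcal{M}, t \models^+ C$
$\mathcal{M}, t \models^- B \wedge C$ iff $\mathcal{M}, t \models^- B$ or $\mathcal{M}, t \models^- C$
$\mathcal{M}, t \models^+ B \vee C$ iff $\mathcal{M}, t \models^+ B$ or $\mathcal{M}, t \models^+ C$
$\mathcal{M}, t \models^- B \vee C$ iff $\mathcal{M}, t \models^- B$ and $\mathcal{M}, t \models^- C$
$\mathcal{M}, t \models^+ B \supset C$ iff $(\forall u \in I)$ if $t \sqsubseteq u$, then $\mathcal{M}, u \models^+ B$ implies $\mathcal{M}, u \models^+ C$
$\mathcal{M}, t \models^- B \supset C$ iff $\mathcal{M}, t \models^+ B$ and $\mathcal{M}, t \models^- C$
$\mathcal{M}, t \models^+ {\sim}B$ iff $\mathcal{M}, t \models^- B$
$\mathcal{M}, t \models^- {\sim}B$ iff $\mathcal{M}, t \models^+ B$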


With this definition, verification and falsification of arbitrary formulas are persistent
with respect to $\sqsubseteq$. Semantic consequence is defined as follows: $\Gamma \models_{N4} A$ iff for every
Nelson model $\mathcal{M} = \langle I, \sqsubseteq, v^+, v^- \rangle$ and every $t \in I$, if $\mathcal{M}, t \models^+ B$ for every $B \in \Gamma$, then
$\mathcal{M}, t \models^+ A$. Nelson's propositional logic N4 is the theory of the class of all Nelson
models in the given language. N4 conservatively extends positive propositional logic,
the positive part of intuitionistic logic. [See chapter 11.] N4 can be axiomatized by
adding to an axiomatization of positive logic the following axiom schemata:
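A1  ${\sim}{\sim}A \equiv A$
A2  ${\sim}(A \wedge B) \equiv ({\sim}A \vee {\sim}B)$
A3  ${\sim}(A \vee B) \equiv ({\sim}A \wedge {\sim}B)$
A4  ${\sim}(A \supset B) \equiv (A \wedge {\sim}B)$


where $A \equiv B$ is defined as $(A \supset B) \wedge (B \supset A)$.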
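
The verification and falsification clauses can be turned into a small evaluator. The following is a minimal sketch under stated assumptions: the two-state model, the atom names `p`, `q`, `r`, and the tuple encoding of formulas are all illustrative choices, not taken from the text. It checks one instance of each of the four schemata at every state, and exhibits a state that verifies both an atom and its strong negation without verifying everything.

```python
# A toy Nelson model: state 0 lies below state 1 in the information order.
STATES = [0, 1]
ORDER = {(0, 0), (0, 1), (1, 1)}             # reflexive partial order
V_PLUS = {"p": {1}, "q": {0, 1}, "r": set()}   # states verifying each atom
V_MINUS = {"p": set(), "q": {1}, "r": set()}   # states falsifying each atom
# Both valuations are upward closed, as the persistence conditions require.


def verifies(t, f):
    """Verification of formula f at state t."""
    kind = f[0]
    if kind == "atom":
        return t in V_PLUS[f[1]]
    if kind == "and":
        return verifies(t, f[1]) and verifies(t, f[2])
    if kind == "or":
        return verifies(t, f[1]) or verifies(t, f[2])
    if kind == "imp":  # quantifies over all states above t
        return all(
            (not verifies(u, f[1])) or verifies(u, f[2])
            for u in STATES if (t, u) in ORDER
        )
    if kind == "neg":  # strong negation swaps the two relations
        return falsifies(t, f[1])
    raise ValueError(kind)


def falsifies(t, f):
    """Falsification of formula f at state t."""
    kind = f[0]
    if kind == "atom":
        return t in V_MINUS[f[1]]
    if kind == "and":
        return falsifies(t, f[1]) or falsifies(t, f[2])
    if kind == "or":
        return falsifies(t, f[1]) and falsifies(t, f[2])
    if kind == "imp":
        return verifies(t, f[1]) and falsifies(t, f[2])
    if kind == "neg":
        return verifies(t, f[1])
    raise ValueError(kind)


def neg(a): return ("neg", a)
def conj(a, b): return ("and", a, b)
def disj(a, b): return ("or", a, b)
def imp(a, b): return ("imp", a, b)
def equiv(a, b): return conj(imp(a, b), imp(b, a))


p, q, r = ("atom", "p"), ("atom", "q"), ("atom", "r")

axiom_instances = [
    equiv(neg(neg(p)), p),                      # A1
    equiv(neg(conj(p, q)), disj(neg(p), neg(q))),  # A2
    equiv(neg(disj(p, q)), conj(neg(p), neg(q))),  # A3
    equiv(neg(imp(p, q)), conj(p, neg(q))),        # A4
]
axioms_hold = all(verifies(t, ax) for t in STATES for ax in axiom_instances)

# State 1 verifies both q and ~q, yet it does not verify r.
glut_without_explosion = (
    verifies(1, q) and verifies(1, neg(q)) and not verifies(1, r)
)
```

A single model of course proves nothing about validity in general; the point is only to make the clauses concrete and to show why a state may carry conflicting information about an atom.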

Contraposition as a rule does not hold in N4. Moreover, provable equivalence
fails to be a congruence relation on the set of formulas. If formulas $A$ and $B$ are
defined as being strongly equivalent iff both $A$, $B$ and their strong negations ${\sim}A$, ${\sim}B$
are provably interderivable, then it can be shown that strong equivalence is a congruence
relation in N4, i.e. there is a replacement theorem for strongly equivalent
formulas. N4 is a system of paraconsistent logic, because not for every $B$, $\{A, {\sim}A\} \vdash B$.
Moreover, N4 is a system of four-valued logic: Every pair $(v^+, v^-)$ of valuations
induces a valuation $v: I \times \mathrm{Atom} \to \{\{1\}, \{0\}, \emptyset, \{1, 0\}\}$ by defining:


$v(t, p) = \{1\}$ iff ($t \in v^+(p)$ and $t \notin v^-(p)$)
$v(t, p) = \{0\}$ iff ($t \notin v^+(p)$ and $t \in v^-(p)$)
$v(t, p) = \{1, 0\}$ iff ($t \in v^+(p)$ and $t \in v^-(p)$)
$v(t, p) = \emptyset$ iff ($t \notin v^+(p)$ and $t \notin v^-(p)$)

The model $\mathcal{M} = \langle I, \sqsubseteq, v^+, v^- \rangle$ and the induced model $\mathcal{M}' = \langle I, \sqsubseteq, v \rangle$ validate the same
formulas of the language under consideration, if for every $p \in$ Atom, we define

$\mathcal{M}', t \models^+ p$ iff $1 \in v(t, p)$
$\mathcal{M}', t \models^- p$ iff $0 \in v(t, p)$

The three-valued logic N3 is the theory of the class of all Nelson models $\langle I, \sqsubseteq, v^+, v^- \rangle$,
where for every atom $p$, $v^+(p) \cap v^-(p)$ is empty. N3 can be axiomatized by
adding to an axiomatization of N4 the ex contradictione schema ${\sim}A \supset (A \supset B)$.


Strong negation in Nelson's logics N3 and N4 is also referred to as constructive
negation, since in both systems negation satisfies

Constructible falsity: $\vdash {\sim}(A \wedge B)$ iff ($\vdash {\sim}A$ or $\vdash {\sim}B$)

In N3 the contradictory forming intuitionistic negation $\neg$ can be defined using
the primitive, contrary forming strong negation $\sim$ and intuitionistic implication $\supset$:

$\neg A$ iff $A \supset {\sim}A$

N3 provides a natural example of a logical system with two kinds of negation
suitable for representing both predicate denial and predicate term negation.
Considerations on the so-called Brouwer-Heyting-Kolmogorov (BHK) interpretation
of the logical operations in terms of direct (or canonical) proofs (alias
constructions) [see chapter 11] may also lead to Nelson’s systems. According to this
interpretation, a canonical proof of $(A \supset B)$, for instance, is a construction that
applied to a proof of $A$ results in a proof of $B$. López-Escobar (1972) suggested
supplementing the BHK interpretation by the notion of (canonical) disproof. He
gives the following disproof interpretation of the intuitionistic connectives $\wedge$, $\vee$, and
$\supset$, and the strong negation $\sim$ (notation adjusted):


(i) The construction $c$ refutes $A \wedge B$ iff $c$ is of the form $(i, d)$ with $i$ either 0 or 1
and if $i = 0$, then $d$ refutes $A$ and if $i = 1$ then $d$ refutes $B$.

(ii) The construction $c$ refutes $A \vee B$ iff $c$ is of the form $(d, e)$ and $d$ refutes $A$ and
$e$ refutes $B$.

(iii) The construction $c$ refutes $A \supset B$ iff $c$ is of the form $(d, e)$ and $d$ proves $A$ and
$e$ refutes $B$.

(iv) The construction $c$ refutes ${\sim}A$ iff $c$ proves $A$.


Under the proof and disproof interpretation, a proof of $\sim A$ is not conceived of as a proof of $A \supset_h \bot$ ('$A$ implies absurdity'), but rather as a refutation of $A$. This is a completely natural and direct way of relating proofs and disproofs by means of negation. López-Escobar uses the following notion of provable sequent, with respect to which Nelson's logic N4 emerges as sound:


$\{A_1, \ldots, A_n\} \to A$ is valid iff there exists a construction $\pi$ such that $\pi(c_1, \ldots, c_n)$ proves $A$, whenever $c_1, \ldots, c_n$ are constructions proving $A_1, \ldots, A_n$ (if $1 \leq n$).

A sequent $\emptyset \to A$ is said to be valid iff a construction exists that proves $A$. Moreover, López-Escobar assumes that no construction both proves and disproves the same $A$. Note that $\{A, \sim A\} \to B$ is valid under the stronger assumption that no formula $A$ is both provable and disprovable.¹⁰

Still another idea that naturally leads to Nelson's system N4 is the idea of atomicity of strong negation. If the provability conditions of a compound formula depend on the provability conditions of its components, then the refutability conditions of a compound formula ought also to depend on the refutability conditions of its components. But then, if negation $\sim$ is to express falsity in the sense of refutability, one would need a negation normal form theorem to the effect that every negated formula $\sim A$ is provably equivalent to a formula $nnf(A)$ in which every occurrence of $\sim$ stands immediately in front of an atomic formula. If, moreover, all literals (i.e. atoms and negated atoms) are to be treated on a par, because the refutability and the provability conditions of atomic formulas are independent of each other, then the result $A'$ of replacing every occurrence of a negated atom $\sim p$ in $nnf(A)$ by a fresh atom $p'$ ought to leave the derivability relation unaffected. In Nelson's logics N3 and N4, the wanted negation normal form theorem is obvious from the axiom schemata A1-A4. Moreover, by induction on proofs in N4, one can show that in fact atomicity of negation holds:


$\Gamma \vdash_{N4} A$ iff $\Gamma' \vdash_{N4} A'$,

where $\Gamma' = \{B' \mid B \in \Gamma\}$.





Negation 


This observation also leads to a simple derivation of the completeness of N4 with respect to the class of all Nelson models from the completeness of positive logic with respect to the class of all intuitionistic Kripke models [see chapter 11]; cf. Gurevich (1977), Pearce (1991), or Rautenberg (1979).¹¹ Suppose that, in addition to the





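on $Atom \cup Atom'$ is a structure $\langle I, \sqsubseteq, v \rangle$, where $\langle I, \sqsubseteq \rangle$ is a partially ordered set and $v$ is a valuation function mapping every propositional variable in $Atom \cup Atom'$ to a subset of $I$. Moreover, persistence of atomic formulas is required: for every $p \in Atom \cup Atom'$ and every $t, u \in I$ such that $t \sqsubseteq u$, $t \in v(p)$ implies $u \in v(p)$. The notion $\mathcal{M}, t \models A$ ($A$ is verified at $t$ in $\mathcal{M}$) is inductively defined as follows:

$\mathcal{M}, t \models p$ iff $t \in v(p)$, $p \in Atom \cup Atom'$

$\mathcal{M}, t \models B \wedge C$ iff $\mathcal{M}, t \models B$ and $\mathcal{M}, t \models C$

$\mathcal{M}, t \models B \vee C$ iff $\mathcal{M}, t \models B$ or $\mathcal{M}, t \models C$

$\mathcal{M}, t \models B \supset_h C$ iff $(\forall u \in I)$ if $t \sqsubseteq u$, then $\mathcal{M}, u \models B$ implies $\mathcal{M}, u \models C$

Given such a model $\mathcal{M} = \langle I, \sqsubseteq, v \rangle$, one can define a Nelson model $\mathcal{N} = \langle I, \sqsubseteq, v^+, v^- \rangle$ by stipulating that for every $p \in Atom$,

$v^+(p) = v(p)$ and $v^-(p) = v(p')$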
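
Lemma For every formula $A$, with $\mathcal{N}$ the Nelson model defined from $\mathcal{M}$ as above,

(i) $\mathcal{N}, t \models^+ A$ iff $\mathcal{M}, t \models A'$, and
(ii) $\mathcal{N}, t \models^- A$ iff $\mathcal{M}, t \models (\sim A)'$.


Proof By simultaneous induction on the complexity of $A$. Consider only one case for the first claim, namely $A = \sim(B \supset_h C)$:

$\mathcal{N}, t \models^+ \sim(B \supset_h C)$
iff $\mathcal{N}, t \models^+ B$ and $\mathcal{N}, t \models^- C$
iff [$\mathcal{M}, t \models B'$ (by the induction hypothesis for (i)) and $\mathcal{M}, t \models (\sim C)'$ (by the induction hypothesis for (ii))]
iff $\mathcal{M}, t \models (B' \wedge (\sim C)')$ $[= (\sim(B \supset_h C))']$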
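
The lemma can be checked mechanically on a small finite model. The sketch below is an illustration only: the two-state model, the tuple encoding of formulas, and the function names are hypothetical, and the verification and falsification clauses for Nelson models are the standard N4 clauses, assumed here because they are defined earlier in the chapter rather than in this passage. `pos(a)` computes $A'$ and `neg(a)` computes $(\sim A)'$.

```python
# Formulas are nested tuples: ("atom", "p"), ("and", B, C), ("or", B, C),
# ("imp", B, C), and ("neg", B) for strong negation.

WORLDS = [0, 1]
LEQ = {(0, 0), (0, 1), (1, 1)}      # partial order: 0 below 1
# Persistent valuation on Atom and on the primed copies in Atom'.
V = {"p": {1}, "p'": {0, 1}, "q": {0, 1}, "q'": set()}


def above(t):
    return [u for u in WORLDS if (t, u) in LEQ]


def kripke(t, a):
    """Verification of a negation-free formula in the Kripke model."""
    tag = a[0]
    if tag == "atom":
        return t in V[a[1]]
    if tag == "and":
        return kripke(t, a[1]) and kripke(t, a[2])
    if tag == "or":
        return kripke(t, a[1]) or kripke(t, a[2])
    if tag == "imp":
        return all(not kripke(u, a[1]) or kripke(u, a[2]) for u in above(t))
    raise ValueError(tag)


def ver(t, a):
    """Verification in the induced Nelson model, with v+(p) = v(p)."""
    tag = a[0]
    if tag == "atom":
        return t in V[a[1]]
    if tag == "and":
        return ver(t, a[1]) and ver(t, a[2])
    if tag == "or":
        return ver(t, a[1]) or ver(t, a[2])
    if tag == "imp":
        return all(not ver(u, a[1]) or ver(u, a[2]) for u in above(t))
    if tag == "neg":
        return fal(t, a[1])
    raise ValueError(tag)


def fal(t, a):
    """Falsification in the induced Nelson model, with v-(p) = v(p')."""
    tag = a[0]
    if tag == "atom":
        return t in V[a[1] + "'"]
    if tag == "and":
        return fal(t, a[1]) or fal(t, a[2])
    if tag == "or":
        return fal(t, a[1]) and fal(t, a[2])
    if tag == "imp":
        return ver(t, a[1]) and fal(t, a[2])
    if tag == "neg":
        return ver(t, a[1])
    raise ValueError(tag)


def pos(a):
    """A': push strong negation to the atoms, then replace ~p by p'."""
    tag = a[0]
    if tag == "atom":
        return a
    if tag == "neg":
        return neg(a[1])
    return (tag, pos(a[1]), pos(a[2]))


def neg(a):
    """(~A)': the translation of the strong negation of A."""
    tag = a[0]
    if tag == "atom":
        return ("atom", a[1] + "'")
    if tag == "and":
        return ("or", neg(a[1]), neg(a[2]))
    if tag == "or":
        return ("and", neg(a[1]), neg(a[2]))
    if tag == "imp":
        return ("and", pos(a[1]), neg(a[2]))
    if tag == "neg":
        return pos(a[1])
    raise ValueError(tag)


def lemma_holds(a):
    return all(ver(t, a) == kripke(t, pos(a)) and fal(t, a) == kripke(t, neg(a))
               for t in WORLDS)


P, Q = ("atom", "p"), ("atom", "q")
SAMPLES = [("neg", ("imp", P, Q)), ("imp", ("neg", P), Q),
           ("neg", ("and", P, ("neg", Q))), ("or", ("neg", ("neg", P)), Q)]
```

Here `pos(("neg", ("imp", P, Q)))` is the conjunction of `p` and `q'`, matching the case worked out in the proof. Because both `p` and `p'` hold at state 1, the Nelson model verifies `p` together with its strong negation there, which N4 (unlike N3) permits.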


Proposition N4 is strongly complete with respect to the class of all Nelson models.


Proof Suppose $\Gamma \nvdash A$. Due to atomicity of negation in N4, $\Gamma' \nvdash A'$. Since positive logic is strongly complete with respect to the class of all intuitionistic Kripke models, there is such a model $\mathcal{M} = \langle I, \sqsubseteq, v \rangle$ and $t \in I$ such that for every $B \in \Gamma$, $\mathcal{M}, t \models B'$ and $\mathcal{M}, t \not\models A'$. The first part of the previous lemma guarantees that there is a Nelson model $\mathcal{N} = \langle I, \sqsubseteq, v^+, v^- \rangle$ such that for every $B \in \Gamma$, $\mathcal{N}, t \models^+ B$ and $\mathcal{N}, t \not\models^+ A$. QED

A general definition of negation as falsity that is meant to encompass both intuitionistic negation and strong negation is suggested in Wansing (1999). Suppose that a single-conclusion consequence relation $\to$ over a formal language containing a unary connective $*$ is given. In other words, for all formulas $A$, $B$ and all finite sets of formulas $\Delta$, $\Gamma$:

Reflexivity: $\vdash A \to A$

Monotonicity: $\vdash \Gamma \to A \Rightarrow \vdash \Gamma \cup \{B\} \to A$

Cut: $\vdash \Gamma \cup \{A\} \to B$, $\vdash \Delta \to A \Rightarrow \vdash \Gamma \cup \Delta \to B$
A binary relation $\leftarrow$ between finite sets of formulas and single formulas is called a single-conclusion $*$-refutation relation iff for all formulas $A$, $B$ and finite sets $\Delta$, $\Gamma$ of formulas:

$*$-reflexivity: $\vdash {*}A \leftarrow A$, $\vdash A \leftarrow {*}A$

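$*$-cut: $\vdash \Delta \to A$, $\vdash \Gamma \cup \{A\} \leftarrow B \Rightarrow \vdash \Delta \cup \Gamma \leftarrow B$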


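Assume that $\to$ and $\leftarrow$ are given as sequent calculi. If $\to$ is a single-conclusion consequence relation, then $*$ is a negation as falsity in $\to$ iff

($\alpha$) the relation $\leftarrow$ defined by '$\Delta \leftarrow A$ iff $\Delta \to {*}A$' is a single-conclusion $*$-refutation relation;

($\beta$) for every formula $A$, not both $\vdash \emptyset \to A$ and $\vdash \emptyset \to {*}A$;

($\gamma$) there is a formula $A$ such that not both $\vdash A \to {*}A$ and $\vdash {*}A \to A$.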







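If $\leftarrow$ is a single-conclusion $*$-refutation relation, then $*$ is a negation as falsity in $\leftarrow$ iff

($\alpha'$) the relation $\to$ defined by '$\Delta \to A$ iff $\Delta \leftarrow {*}A$' is a single-conclusion consequence relation;

($\beta'$) for every formula $A$, not both $\vdash \emptyset \leftarrow A$ and $\vdash \emptyset \leftarrow {*}A$;

($\gamma'$) there is a formula $A$ such that not $\vdash A \leftarrow A$.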


If $*$ satisfies both ($\alpha$) and ($\alpha'$) for a single-conclusion consequence relation $\to$ and a single-conclusion $*$-refutation relation $\leftarrow$, then negation as falsity is a vehicle for either keeping $\to$ and dispensing with $\leftarrow$ or keeping $\leftarrow$ and dispensing with $\to$. Then both double negation introduction $A \to {*}{*}A$ and double negation elimination ${*}{*}A \to A$ are derivable. Clearly, the relation defined by ($\alpha$) is a single-conclusion $*$-refutation relation iff $*$ satisfies $\vdash A \to {*}{*}A$, and double negation introduction and analogues of ($\beta$) and ($\gamma$) are satisfied by strong negation in N3 and N4.




Tennant (1999) also motivates negation in his system of intuitionistic relevant logic by considerations on both proofs and disproofs. However, whereas in Tennant's natural deduction system there are direct proofs, there are no direct disproofs of compound formulas. In this system there are no derivations that, rather than merely revealing the inconsistency of a premise set, lead to the conclusion that a certain compound formula is refutable. A discussion of Tennant's approach can be found in Wansing (1999).


18.3.2. Negation as inconsistency 


Gabbay (1988) defines a syntactic notion of negation as inconsistency. Suppose again that a single-conclusion consequence relation $\to$ over a formal language containing a unary connective $*$ is given. The basic idea of Gabbay's definition is that the negation ${*}A$ of a formula $A$ is derivable from a set of premises $\Gamma$ iff some undesirable formula $B$ from a set of unwanted formulas $\theta^*$ is derivable from $\Gamma$ together with $A$. It is assumed that the logical object language either already contains or is conservatively extendible by a counterpart of the set-theoretical combination of premises. This counterpart is conjunction, $\wedge$, governed by the following introduction rules:


($\to\wedge$) $\vdash \Gamma \to A$, $\vdash \Gamma \to B \Rightarrow \vdash \Gamma \to (A \wedge B)$
($\wedge\to$) $\vdash \Gamma \cup \{A, B\} \to C \Rightarrow \vdash \Gamma \cup \{(A \wedge B)\} \to C$


The unary operation $*$ is then said to be a negation (as inconsistency) in $\to$ iff there is a non-empty set $\theta^*$ of formulas, which is not the same as the set of all formulas, such that for every finite set $\Gamma$ of formulas and every formula $A$:


$\vdash \Gamma \to {*}A$ iff $(\exists B \in \theta^*)(\vdash \Gamma \cup \{A\} \to B)$


Moreover, $\theta^*$ must not contain any theorems. If such a collection of unwanted formulas exists, it can always be chosen as $\{C \mid \vdash \emptyset \to {*}C\}$, since by (reflexivity), the latter set is non-empty, if $*$ is a negation. The definition of negation as inconsistency can therefore be reformulated without appeal to $\theta^*$. Namely, $*$ is a negation as inconsistency in $\to$ iff for every finite set $\Gamma$ of formulas and every formula $A$:


$\vdash \Gamma \to {*}A$ iff $\exists C(\vdash \emptyset \to {*}C$ and $\vdash \Gamma \cup \{A\} \to C)$


Negation in quite a few familiar logical systems can be shown to be a negation as inconsistency. In minimal, intuitionistic, and classical logic, for example, $\theta^*$ can be identified with the set of all explicit contradictions $(A \wedge {*}A)$ in the respective language. In Gabbay and Wansing (1996) the notion of negation as inconsistency (alias inferential negation) is extended to a type of nonmonotonic inference relations between structured databases called structured consequence relations.
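
For classical logic the defining biconditional can be tested by brute force. The following Python sketch is an illustration under assumptions: truth-table entailment over two atoms stands in for the consequence relation, `THETA` is a small sample of explicit contradictions rather than the full set, and all names are hypothetical.

```python
from itertools import product

ATOMS = ["p", "q"]


def ev(f, val):
    """Evaluate a formula (nested tuples) under a classical valuation."""
    tag = f[0]
    if tag == "atom":
        return val[f[1]]
    if tag == "not":
        return not ev(f[1], val)
    if tag == "and":
        return ev(f[1], val) and ev(f[2], val)
    if tag == "or":
        return ev(f[1], val) or ev(f[2], val)
    raise ValueError(tag)


def entails(premises, conclusion):
    """Classical entailment by exhaustive truth tables."""
    for bits in product([False, True], repeat=len(ATOMS)):
        val = dict(zip(ATOMS, bits))
        if all(ev(g, val) for g in premises) and not ev(conclusion, val):
            return False
    return True


def contradiction(f):
    return ("and", f, ("not", f))


P, Q = ("atom", "p"), ("atom", "q")
THETA = [contradiction(P), contradiction(Q)]   # sample of unwanted formulas


def biconditional_holds(gamma, a):
    """Gamma entails not-A  iff  Gamma plus A entails some member of THETA."""
    lhs = entails(gamma, ("not", a))
    rhs = any(entails(gamma + [a], b) for b in THETA)
    return lhs == rhs
```

For instance, `biconditional_holds([("not", P)], ("and", P, Q))` holds with both sides true, while for the empty premise set and `P` both sides are false. No member of `THETA` is a theorem, as the definition requires.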

Every negation as inconsistency satisfies contraposition as a rule in the form:


$\Gamma \cup \{A\} \to B$ implies $\Gamma \cup \{{*}B\} \to {*}A$

Suppose $\Gamma \cup \{A\} \to B$. Since ${*}B \to {*}B$, by definition there is a $C \in \theta^*$ such that $\{{*}B, B\} \to C$. Applying (cut) one obtains $\Gamma \cup \{{*}B, A\} \to C$ and hence $\Gamma \cup \{{*}B\} \to {*}A$. Therefore, strong negation in Nelson's constructive logics, for which the contraposition rule fails, is not a negation as inconsistency. Also, every negation as inconsistency validates the Law of Excluded Contradiction, ${*}({*}A \wedge A)$. Since ${*}A \to {*}A$, we have $\{{*}A, A\} \to B$, for some $B \in \theta^*$. Hence, ${*}A \wedge A \to B$, for some $B \in \theta^*$, and therefore $\emptyset \to {*}({*}A \wedge A)$. Also double negation introduction $\vdash A \to {*}{*}A$ is provable:


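1. $\{A, {*}A\} \to (A \wedge {*}A)$
2. $\{A, {*}(A \wedge {*}A)\} \to {*}{*}A$   (contraposition, from 1)
3. $\emptyset \to {*}(A \wedge {*}A)$   (Law of Excluded Contradiction)
4. $A \to {*}{*}A$   ((cut), from 2 and 3)


Moreover, we have $\vdash {*}{*}{*}A \to {*}A$:


1. ${*}A \to {*}A$   (reflexivity)
2. $\{{*}A, A\} \to B$   (reformulated definition of negation)
3. $\{{*}B, A\} \to {*}{*}A$   (contraposition)
4. $\emptyset \to {*}B$
5. $A \to {*}{*}A$   ((cut), from 3 and 4)
6. ${*}{*}{*}A \to {*}A$   (contraposition)


for some $B$ such that $\vdash \emptyset \to {*}B$. In the inverse direction we have:


1. ${*}{*}A \to {*}{*}A$   (reflexivity)
2. $\{{*}A, {*}{*}A\} \to B$   (reformulated definition of negation)
3. $\{{*}B, {*}A\} \to {*}{*}{*}A$   (contraposition)
4. $\emptyset \to {*}B$
5. ${*}A \to {*}{*}{*}A$   ((cut), from 3 and 4)


for some $B$ such that $\vdash \emptyset \to {*}B$.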


Observation Suppose $\to$ is consistent in the sense that for no formula $A$ of the underlying language, both $\emptyset \to A$ and $\emptyset \to {*}A$ are provable. Then $*$ is a negation as inconsistency iff $*$ satisfies contraposition as a rule, the Law of Excluded Contradiction, and double negation introduction.


Proof It has already been shown that every negation as inconsistency satisfies the three conditions. For the converse direction, suppose that $\vdash \Gamma \to {*}A$. Then $\vdash \bigwedge\Gamma \to {*}A$ and, by contraposition, $\vdash {*}{*}A \to {*}\bigwedge\Gamma$. By the structural rules and the rules for $\wedge$, $\vdash \Gamma, {*}{*}A \to {*}\bigwedge\Gamma \wedge \bigwedge\Gamma$, and since $A \to {*}{*}A$, one may apply (cut) to obtain $\vdash \Gamma, A \to {*}\bigwedge\Gamma \wedge \bigwedge\Gamma$. Thus one may choose as $\theta^*$ the set of all explicit contradictions $A \wedge {*}A$. This set $\theta^*$ does not contain any theorems, because otherwise $\vdash \emptyset \to C$ and $\vdash \emptyset \to {*}C$ for some formula $C$, contradicting consistency. Suppose now that $\vdash \Gamma, A \to (C \wedge {*}C)$. Then, by contraposition, $\vdash \Gamma, {*}(C \wedge {*}C) \to {*}A$, and since $\vdash \emptyset \to {*}(C \wedge {*}C)$, an application of (cut) gives $\vdash \Gamma \to {*}A$. QED

Obviously, not every recognized negation is a negation as inconsistency. Strong negation in Kleene's three-valued logic [see chapter 14], for example, fails to be a negation as inconsistency because, due to the absence of any theorems, the Law of Excluded Contradiction fails. The notion of negation as falsity is also non-trivial. In normal modal logic, for instance, necessity $\Box$ fails to be a negation as falsity, because $\vdash (p \vee \neg p)$ and $\vdash \Box(p \vee \neg p)$, in contradiction of ($\beta$). In Wansing (1999), it is shown that every negation as inconsistency is a negation as falsity.


Observation Every negation as inconsistency is a negation as falsity.


Proof It must be shown that the defined relation $\leftarrow$ satisfies ($*$-reflexivity), ($*$-cut), ($\beta$), and ($\gamma$).

($*$-reflexivity): ${*}A \leftarrow A$ is immediate from ${*}A \to {*}A$ and the definition of $\leftarrow$. Since $\emptyset \to {*}({*}A \wedge A)$ and $\{A, {*}A\} \to {*}A \wedge A$, there is a $C$ such that $\emptyset \to {*}C$ and $\{A\} \cup \{{*}A\} \to C$, hence $A \to {*}{*}A$ and thus, by the definition of $\leftarrow$, $A \leftarrow {*}A$.

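($*$-cut): The ($*$-cut)-rule follows from the definition of $\leftarrow$ and (cut) for $\to$. Assume $\Delta \to A$ and $\Gamma \cup \{A\} \leftarrow B$. This means $\Delta \to A$ and $\Gamma \cup \{A\} \to {*}B$. An application of (cut) gives $\Delta \cup \Gamma \to {*}B$, which means $\Delta \cup \Gamma \leftarrow B$, as required.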

($\beta$): Suppose that both $\emptyset \to A$ and $\emptyset \to {*}A$ for some $A$. Then there is a $B \in \theta^*$ such that $A \to B$. However, by (cut), $\emptyset \to B$, that is, $\theta^*$ contains a theorem, quod non.

($\gamma$): Suppose that for every formula $A$, $A \to {*}A$ and ${*}A \to A$. Then there is a $B \in \theta^*$ such that $A \to B$, that is, $\emptyset \to {*}A$. Applying (cut) to the latter and ${*}A \to A$ gives $\emptyset \to A$. But then, since $\theta^*$ is non-empty, it must contain a theorem; a contradiction. QED











18.3.3. Negation as orthogonality 


The interpretation of negation by means of the notion of orthogonality (or incompatibility) turns out to be very illuminating, because it enables a classification of different concepts of negation in terms of correlations between negation laws and algebraic as well as relational properties together with conditions on valuations. This semantic classification has been investigated in a series of papers by Dunn (1993, 1996, 1999).

Consider a partially ordered set $(A, \leq, 0, 1)$ with bounds 0, 1. From an algebraic point of view, the elements of such a set may be seen as propositions. Intuitively, $x \leq y$ may be understood as '$x$ provably implies $y$.' In addition to $\leq$ one may consider another binary relation $\perp$ on $A$ and think of $x \perp y$ as '$x$ is incompatible with $y$.' It is then natural to define two negations $\neg$ and $\sim$ as follows:


$x \leq \neg y$ iff $x \perp y$
$x \leq \sim y$ iff $y \perp x$


Obviously, both notions of negation coincide if the not implausible assumption is made that incompatibility is a symmetric relation. By imposing constraints on $\leq$, $\neg$, and $\sim$, various concepts of negation can be defined. Interestingly, the defining properties can be correlated with relational properties of an incompatibility relation in so-called perp models (together with conditions on valuations). A perp frame is a structure $\langle I, \sqsubseteq, \perp \rangle$, where the incompatibility relation $\perp$ satisfies

($t \perp u$ and $t \sqsubseteq t'$) implies $t' \perp u$

($t \perp u$ and $u \sqsubseteq u'$) implies $t \perp u'$

A perp model is a structure $\mathcal{M} = \langle \mathcal{F}, v \rangle$, where $\mathcal{F}$ is a perp frame, and $v$ is a valuation function persistent with respect to the relation $\sqsubseteq$. If $X$ is a subset of $I$, let

$t \perp X$ iff for every $u \in X$, $t \perp u$

$X \perp t$ iff for every $u \in X$, $u \perp t$

One may then define ${}^{\perp}X = \{t \mid t \perp X\}$ and $X^{\perp} = \{t \mid X \perp t\}$. Negated formulas are evaluated according to the following clauses:

$\mathcal{M}, t \models \neg A$ iff $t \in {}^{\perp}[A]$

$\mathcal{M}, t \models \sim A$ iff $t \in [A]^{\perp}$

where $[A] = \{t \in I \mid \mathcal{M}, t \models A\}$. With these definitions, negated formulas are also persistent with respect to $\sqsubseteq$.
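
The two perp operations are easy to compute on a finite frame. The sketch below is illustrative only: the three-state frame, both incompatibility relations, and the function names are hypothetical. It checks the two frame conditions and shows that the two negations agree when incompatibility is symmetric and can differ when it is not.

```python
WORLDS = {0, 1, 2}
# Partial order: reflexive pairs plus 0 below 2.
LEQ = {(0, 0), (1, 1), (2, 2), (0, 2)}

PERP_SYM = {(0, 1), (1, 0), (2, 1), (1, 2)}   # symmetric incompatibility
PERP_ASYM = {(0, 1), (2, 1)}                  # not symmetric


def is_perp_frame(perp):
    """Check both frame conditions: perp is preserved upwards in each argument."""
    for (t, u) in perp:
        for w in WORLDS:
            if (t, w) in LEQ and (w, u) not in perp:
                return False
            if (u, w) in LEQ and (t, w) not in perp:
                return False
    return True


def left_perp(perp, X):
    """The set of states t with t perp u for every u in X."""
    return {t for t in WORLDS if all((t, u) in perp for u in X)}


def right_perp(perp, X):
    """The set of states t with u perp t for every u in X."""
    return {t for t in WORLDS if all((u, t) in perp for u in X)}


A_SET = {1}   # hypothetical truth set [A] of some formula A
```

With `PERP_SYM`, both `left_perp` and `right_perp` of `A_SET` are `{0, 2}`, so the two negated formulas have the same truth set. With `PERP_ASYM`, the left perp is still `{0, 2}` but the right perp is empty.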


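Dunn (1993, 1996) has shown that posets with bounds satisfying the properties listed below can be represented by perp frames satisfying the associated conditions:¹²


Negation        Posets                                                   Perp models

Subminimal      $x \leq y \Rightarrow \neg y \leq \neg x$                $[\neg A] = {}^{\perp}[A]$
                $x \leq y \Rightarrow \sim y \leq \sim x$                $[\sim A] = [A]^{\perp}$

Galois          $x \leq \neg y \Leftrightarrow y \leq \sim x$            $[\neg A] = {}^{\perp}[A]$, $[\sim A] = [A]^{\perp}$

Minimal         $x \leq \neg y \Rightarrow y \leq \neg x$                $\perp$ symmetric

Intuitionistic  Minimal +                                                $\perp$ irreflexive, symmetric
                $(x \leq y \;\&\; x \leq \neg y) \Rightarrow x \leq 0$

De Morgan       Minimal + $\neg\neg x \leq x$                            $[A]^{\perp\perp} = [A]$, $\perp$ symmetric

Ortho           Intuitionistic + $\neg\neg x \leq x$                     $[A]^{\perp\perp} = [A]$, $\perp$ irreflexive, symmetric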


The conditions on posets express inferential properties of negation. The negation logics defined by these properties are sound and, in view of Dunn's representation theorems, also complete with respect to the classes of perp models satisfying the associated conditions. Instead of considering partially ordered sets with bounds one may, of course, also study other algebraic structures, also for a logical object language richer than $\{\neg, \sim\}$; see, for example, Dunn (1996). Since every subminimal negation satisfies the contraposition rule, however, extensions of subminimal negation like the negations investigated by La Palme Reyes et al. are unsuitable for representing predicate term negation.


18.3.4. Perfect negation


According to Avron (1999), a unary connective of a logic $\Lambda$ is a perfect negation if it enjoys a certain syntactic property and, in addition, $\Lambda$ is strongly normal in a certain sense. To state the syntactic property, various definitions are first needed.

Suppose $\Lambda$ is presented as a single-conclusion sequent calculus (defined for premise multi-sets not necessarily satisfying monotonicity). Then its associated internal consequence relation $\to_\Lambda$ is defined by


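$A_1, \ldots, A_n \to_\Lambda A$ iff $\vdash_\Lambda A_1, \ldots, A_n \to A$

$\Lambda$'s associated external consequence relation $\vdash_\Lambda$ is defined by:


$A_1, \ldots, A_n \vdash_\Lambda A$ iff the sequent $\to A$ is derivable in $\Lambda$ from the sequents $\to A_1, \ldots, \to A_n$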





A unary connective $*$ is said to be an internal negation for a consequence relation $\to$ iff the relation $\to$ is closed under


$A, \Gamma \to \Delta \Rightarrow \Gamma \to \Delta, {*}A$ and $\Gamma \to \Delta, A \Rightarrow {*}A, \Gamma \to \Delta$


The existence of an internal negation forces a consequence relation to be a multiple-conclusion relation. A single-conclusion consequence relation $\to$ over a language with a unary connective $*$ is said to be strongly symmetric with respect to $*$ iff there exists a multiple-conclusion consequence relation $\to^s$ defined over the same language such that


$\Gamma \to^s A$ iff $\Gamma \to A$


and $*$ is an internal negation for $\to^s$.

If $\Lambda$ is presented as a single-conclusion sequent calculus over a logical language containing a unary operation $*$, then, according to Avron, $*$ is a perfect negation from the syntactic point of view if the internal consequence relation associated with $\Lambda$ is strongly symmetric with respect to $*$. Intuitionistic negation and Nelson's strong negation fail to be perfect in this sense because every internal negation satisfies double-negation elimination and the contraposition rule. Indeed, Avron (1999, thm 4) shows that if $\to$ is any consequence relation, then it is strongly symmetric with respect to $*$ iff


(i) $A \to {*}{*}A$
(ii) ${*}{*}A \to A$
(iii) $\Gamma, A \to B$ implies $\Gamma, {*}B \to {*}A$.





If the requirement of strong symmetry is relaxed in a certain way, strong negation still emerges as perfect from the syntactic point of view in a sense. Suppose $\to$ is a consequence relation defined over a language containing the unary operation $*$.










Heinrich Wansing 


Define the multiple-conclusion consequence relation $\to^s$ by requiring that $A_1, \ldots, A_n \to^s B_1, \ldots, B_k$ iff for all $1 \leq i \leq n$ and $1 \leq j \leq k$:


$A_1, \ldots, A_{i-1}, {*}B_1, \ldots, {*}B_k, A_{i+1}, \ldots, A_n \to {*}A_i$

and

$A_1, \ldots, A_n, {*}B_1, \ldots, {*}B_{j-1}, {*}B_{j+1}, \ldots, {*}B_k \to B_j$

Then (Avron, 1999, propn 9)


(i) $\Gamma \to^s A$ implies $\Gamma \to A$,
(ii) $\emptyset \to^s A$ iff $\emptyset \to A$, and
(iii) $\to^s$ is a conservative extension of $\to$ iff $\Gamma, A \to B$ implies $\Gamma, {*}B \to {*}A$.


If one tries to see $*$ as a negation in $\to$, then, in Avron's view, $\to^s$ is induced by $\to$ in a natural way. The operation $*$ in $\to$ is defined to be weakly symmetric if it is an internal negation of $\to^s$. As Avron observes, $*$ is weakly symmetric in $\to$ if $A \to {*}{*}A$ and ${*}{*}A \to A$, and these conditions are satisfied by strong negation in Nelson's N3 and N4.

A unary operation $*$ in a logical system $\Lambda$ is a perfect negation from the semantical point of view if $\Lambda$ is strongly normal. For Avron (1999, p. 15), "a semantics is, essentially, just a set S of theories," since "the essence of a 'model' is given by the set of sentences which are true in it." A unary connective $*$ is a (strong) semantic negation if, in terms of validity in a model, it reflects that every formula is either true or not true in a model and not both. Recall that a theory $T$ is said to be consistent if there is no formula $A$ such that both $A$ and its negation are derivable from $T$. A theory $T$ is complete if for every formula $A$, either $A$ or its negation can be derived from $T$. If a theory is both consistent and complete, it is said to be normal. Assuming that $\Lambda$ is given by a single-conclusion consequence relation, that the underlying language contains a unary operation $*$ (considered to be a negation), and that for no formula $A$, both $A$ and ${*}A$ are provable, Avron presents various characterizations of $\Lambda$ (1999, propn 26):


• $\Lambda$ is strongly complete iff whenever $T \nvdash A$ there is a complete extension $T'$ of $T$ such that $T' \nvdash A$.

• $\Lambda$ is weakly complete iff whenever $\nvdash A$ there is a complete theory $T'$ such that $T' \nvdash A$.

• $\Lambda$ is strongly normal iff whenever $T \nvdash A$ there is a complete and consistent extension $T'$ of $T$ such that $T' \nvdash A$.

• $\Lambda$ is weakly normal iff whenever $\nvdash A$ there is a complete and consistent theory $T'$ such that $T' \nvdash A$.

• $\Lambda$ is c-normal iff every consistent theory in $\Lambda$ has a complete and consistent extension.

• $\Lambda$ is strongly c-normal iff whenever $T$ is consistent and $T \nvdash A$ there is a complete and consistent extension $T'$ of $T$ such that $T' \nvdash A$.

Avron (1999, propn 27) observes that if $\Lambda$ is finitary, then it is strongly complete iff for every theory $T$ and all formulas $A$, $B$:


$T, A \vdash_\Lambda B$ and $T, {*}A \vdash_\Lambda B$ implies $T \vdash_\Lambda B$


and that this condition is equivalent with the provability of $A \vee {*}A$, if the underlying language contains a disjunction operation $\vee$ such that $T, A \vee B \vdash_\Lambda C$ iff both $T, A \vdash_\Lambda C$ and $T, B \vdash_\Lambda C$. Therefore, intuitionistic negation and any representation of predicate term negation as a unary connective are bound to be imperfect in Avron's sense. Indeed, the only negation among the unary connectives considered by Avron that emerges as perfect from both the syntactic and the semantic point of view is the Boolean negation of classical logic. Whereas intuitionistic logic and Kleene's three-valued logic are still strongly consistent and c-normal, and N3 is still strongly consistent, N4 enjoys none of the listed properties.


18.4. Epilogue 


Although there is no general agreement on what negation is, whether it is a unary connective or an innersentential operation, whether contraposition as a rule holds for it or not, etc., it is accepted knowledge that there is more than one kind of negation, be it in the same syntactic type or in different categories. This insight to a large extent rests on the Aristotelian distinction between predicate denial and predicate term negation. Moreover, whereas the contradictory-forming predicate denial can be represented by a contradictory-forming sentential negation, the contrary-forming predicate term negation can be represented by the strong negation in many-valued logics such as Nelson's constructive systems N3 or N4.

The classification of kinds of sentential negation may be approached from various points of departure. One idea is to define a general non-trivial notion of negation in a system, meant to cover all recognized negations. This is, quite explicitly, the intention behind Gabbay's (1988) definition of negation in a system. The notion of negation as falsity generalizes Gabbay's suggestion; other definitions or additional requirements may lead to more restricted notions of negation. A classification of negation may, however, also be seen primarily as a means for identifying non-negations. Avron (1999, p. 21), for example, puts his criteria to such a use when he concludes that "[t]he negation of intuitionistic logic is not really a negation." A third aspect of a classificatory scheme is that it may help to recognize the interrelations between the items classified. In this respect, the semantic classifications of Dunn (1993, 1996, 1999) turn out to be particularly useful.


Suggested further reading 


The classical reference as far as linguistic and philosophical aspects of negation are concerned is Horn's encyclopedic monograph (1989). This book also contains a careful introduction to Aristotelian term logic. Another important reference to negation in Aristotelian term logic is Englebretsen (1981). More recent volumes devoted entirely to the study of negation are Wansing (1996) and Gabbay and Wansing (1999). The latter collection of research papers aims at providing a comprehensive account of negation from a logical, computational and philosophical point of view. A discussion of the notion of negation as finite failure to derive can be found in Fine (1989). The relation between monotonicity properties of natural language expressions and negative polarity items like 'anything' is dealt with, for example, in Zwarts (1996).


Notes 


1 I shall not even try to survey this literature. Nor shall I deal with philosophically interesting negation-related themes such as negative existentials, presupposition, or paraconsistency. Nothing at all will be said about the pragmatics or about psychological aspects of negation. Instead, the emphasis of this chapter will be on characterizing and classifying various notions of negation.

2 Horn (1989, p. 38) remarks that "[t]he unique polar contrary and the unique immediate contrary of a given term will not in general coincide."

3 Von Wright (1989, p. 5) criticizes Aristotle for his willingness to call both 'John is well' and 'John is ill' false, if the name 'John' does not denote anything:


On this point Aristotle might be accused of obscuring a distinction which in other contexts he marks. I mean the distinction between the case when "x is P" is not true because x does not exist or because P cannot be "naturally predicated" of it, and the case when "x is P" is not true because not-P can be (truly) predicated of it. A convenient way of marking this distinction would be to say that "x is P" is false only in the case when "x is not P" is





4 There are exceptions such as 'undecidable' meaning 'not decidable'.

5 In the sequel I shall not always pay attention to the mention/use distinction and omit quotation marks if misunderstandings are unlikely to occur.

6 From x yone obtains x4 —y = yy, and since yy 0, and 0 is the Kast element, 
(18.1) implies ys 

7 Contraposible strong negation is studied in Nelson (1958).

8 A comprehensive study of four-valued logic, including Nelson's N4, can be found in Dunn (2000); see also Belnap (1977).

9 Other application areas in which the need for logics with more than one kind of negation arises include database theory, logic programming, and nonmonotonic reasoning; see, for instance, Wagner (1994), and Wansing (1995).

10 A more comprehensive discussion of the proof and disproof interpretation can be found in Wansing (1993).

11 An algebraic analysis of Nelson's logics can be found in Rasiowa (1974); axiomatic extensions of N3 are investigated, for example, in Kracht (1998).

12 For DeMorgan and ortho negation we assume symametry of 1. Hence =~ and 
We have MG FA i for every we T,£2 w implies a6 w¥ A. Since an inference At B 
is defined to be valid in a perp model = (I, Cy») iff for every 1€ 1, a6 + A implies 
1, 1 B the validity of double negation climiaation in amounts to the validity of 
DOA D'A in the modal model (I. £, »). The lamer formula is modaly equivalent 
‘wth the Sablgys formula A’ OCIA, and hence is int-onder definable. Ie comesponds to 
(DNE) We3u(tLu a YauLs> F=5)) 


From the modal point of view, double negation introduction expresses the axiom schema $A \supset \Box\Diamond A$, which is known to correspond with the symmetry of the accessibility relation. Note that in the case of (DNE) we have shown correspondence but not completeness.
Quantifiers
Dag Westerståhl


19.1. Routes to Quantifiers


There are two main routes to a concept of (generalized) quantifier. The first starts from first-order logic, FO, and generalizes from the familiar $\forall$ and $\exists$ occurring there. The second route begins with real languages, and notes that many so-called noun phrases, a kind of phrase which occurs abundantly in most languages, can be interpreted in a natural and uniform way using quantifiers.

This chapter takes the first route. One reason is that it leads most directly to a most general notion of a quantifier, subsuming those one finds in natural languages. Another reason is that FO is so familiar, and in any case is presented in chapter 1 of this book. Indeed, acquaintance with that chapter is assumed and (with few exceptions) the notation introduced there is used here. At the end of this chapter, I will indicate what quantifiers have to do with natural languages.

The actual historical development of the concept of a quantifier is slightly complicated. The expressions 'all', 'some', 'no', 'not all' from Aristotelian syllogistics are readily seen as (generalized) quantifiers of type $\langle 1, 1 \rangle$: they are definable from $\forall$ and $\exists$ but not the same as these; all of this will be explained shortly. Frege, who, if anyone, must be regarded as the inventor of FO, actually had in his possession essentially the concept of a generalized quantifier that will be encountered here (the main difference being that he quantified over a fixed universe of all objects, whereas, here, quantifiers are relativized to arbitrary (sub)universes). However, since he could express all the mathematics he needed with $\forall$ and $\exists$, he was content to have only these (in fact only $\forall$) in his Begriffsschrift. Much later, when first Mostowski (1957) and then Lindström (1966) introduced generalized quantifiers into mathematical logic, opening up the study of so-called model-theoretic logics, they were apparently unaware of Frege's notion. Later still, linguists noted the relevance of quantifiers to natural languages, for example, Barwise and Cooper (1981) and Keenan and Stavi (1986). They found, of course, that the four Aristotelian quantifiers were prime examples of 'natural language quantifiers', but also that there were many more, not definable from these.

I will not dwell further on history here; the interested reader can find more in Westerståhl (1989), which also presents the logical and linguistic properties of quantifiers in much more detail. Another survey, emphasizing the link to natural languages, is Keenan and Westerståhl (1997).


19.2. First-Order Logic Revisited


From chapter 1, first recall that a first-order language has a signature $\sigma$ which is a set of non-logical symbols: relation symbols $P, R, \ldots$ of various arities, function symbols $F, G, \ldots$ of various arities, and individual constants $c, d, \ldots$. A structure (or model) for $\sigma$, or simply a $\sigma$-structure, consists of a universe $A$ and an appropriate interpretation $\cdot^{\mathcal{A}}$ of the symbols in $\sigma$, so that if $P$ is an $n$-ary relation symbol in $\sigma$ then $P^{\mathcal{A}}$ is an $n$-ary relation over $A$; if $F$ is an $n$-ary function symbol in $\sigma$ then $F^{\mathcal{A}}$ is an $n$-ary operation on $A$; and if $c$ is an individual constant in $\sigma$ then $c^{\mathcal{A}}$ is an element of $A$. So one may write


$\mathcal{A} = (A, P^{\mathcal{A}}, R^{\mathcal{A}}, \ldots, F^{\mathcal{A}}, G^{\mathcal{A}}, \ldots, c^{\mathcal{A}}, d^{\mathcal{A}}, \ldots)$


(Note that here $\mathcal{A}, \mathcal{B}, \ldots$ are used for structures where chapter 1 uses a different notation, and moreover that here, to save notation, the very same letters are often used for the universes of those structures.)

The signature and its symbols can often be left implicit. For example, if one writes


$\mathcal{N} = (N, <, +, \cdot, S, 0)$


where $N = \{0, 1, 2, \ldots\}$, it is understood that this is a structure for a signature with one binary relation symbol denoting the usual order of the natural numbers, two binary function symbols denoting addition and multiplication respectively, one unary function symbol denoting the successor operation, and one individual constant denoting 0. In fact, one often uses '<', '+', etc., for both the symbols and their denotations in such a case. A structure is called relational if its signature contains only relation symbols.

Now, the fundamental relation


$\mathcal{A} \models \psi[a_1, \ldots, a_n]$   (19.1)


means that $\psi$ is true in $\mathcal{A}$ under a valuation $v$ such that $v(x_i) = a_i$ for $1 \leq i \leq n$, where $\psi = \psi(x_1, \ldots, x_n)$ is a $\sigma$-formula with at most $x_1, \ldots, x_n$ free, $\mathcal{A}$ is a $\sigma$-structure, and $a_1, \ldots, a_n \in A$. When $\psi$ is a sentence, i.e., a formula without free variables, $\mathcal{A} \models \psi$ is often read '$\psi$ is true in $\mathcal{A}$,' or '$\mathcal{A}$ is a model of $\psi$.'

A sequence $(a_1, \ldots, a_n)$ may be abbreviated as $\mathbf{a}$. Then, with an obvious extension (or, if you will, abuse) of the above notation, one can write the standard explications of the universal and existential quantifiers as follows, where $\varphi = \varphi(x, x_1, \ldots, x_n)$:

$\mathcal{A} \models (\forall x)\varphi[x, \mathbf{a}]$ iff for every $a \in A$, $\mathcal{A} \models \varphi[a, \mathbf{a}]$   (19.2)


$\mathcal{A} \models (\exists x)\varphi[x, \mathbf{a}]$ iff for some $a \in A$, $\mathcal{A} \models \varphi[a, \mathbf{a}]$   (19.3)
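
On a finite universe the two clauses can be executed directly. The following Python sketch is only an illustration: the four-element universe and the two sample formulas are hypothetical, and a formula with parameters is modelled as a Python predicate whose first argument is the quantified variable.

```python
UNIVERSE = {0, 1, 2, 3}


def forall(universe, phi, *params):
    """Clause (19.2): phi holds of every element, parameters held fixed."""
    return all(phi(a, *params) for a in universe)


def exists(universe, phi, *params):
    """Clause (19.3): phi holds of some element, parameters held fixed."""
    return any(phi(a, *params) for a in universe)


def at_least(x, x1):
    """Sample formula phi(x, x1): x1 < x or x1 = x."""
    return x1 < x or x1 == x


def below(x, x1):
    """Sample formula phi(x, x1): x < x1."""
    return x < x1
```

With the parameter interpreted as 0, `forall(UNIVERSE, at_least, 0)` holds, since 0 is the least element; with the parameter 1 it fails. Dually, `exists(UNIVERSE, below, 0)` fails and `exists(UNIVERSE, below, 1)` holds.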


19.3. What Do Quantifier Symbols Denote? 


Equations (19.2) and (19.3) state what '$\forall$' and '$\exists$' mean, but they do so in an indirect way: they do not state what, if anything, they denote. On the other hand, the structure $\mathcal{A}$ does say what the symbols in $\sigma$ denote: $P$ denotes $P^{\mathcal{A}}$, etc. With a medieval term, the $\sigma$-symbols are given categorematically, whereas $\forall$ and $\exists$ are defined syncategorematically. Can one give a categorematic definition of the quantifiers?

This has been a vexed question in the history of logic. Informally, one might try to think of something like 'a man' denoting some particular man. What then about 'every man'? It would seem to have to denote the set of all men. But matters worsen if one considers 'no men'; does this denote the empty set? If so, it has the same denotation as 'no dogs', and this seems wrong. Considerations like these may lead one to suppose that there is no coherent and uniform way of assigning denotations to quantified phrases. But in fact there is, and the theory of generalized quantifiers provides the solution.

Consider first the corresponding question for the propositional operators, say,
conjunction. Everyone knows that ‘&’ can be taken to denote a binary truth function.
The corresponding clause in the usual truth definition does not mention this
truth function explicitly, however; it is still syncategorematic:


$\mathcal{A} \models (\varphi \,\&\, \psi)[\bar a]$ iff $\mathcal{A} \models \varphi[\bar a]$ and $\mathcal{A} \models \psi[\bar a]$ (19.4)


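To reformulate this, begin by noting that in a structure $\mathcal{A}$, a formula with $k$ free
variables denotes a $k$-ary relation over $A$: the set of $k$-tuples of elements of $A$
satisfying the formula. Thus define, for any formula $\varphi = \varphi(x_1, \ldots, x_n)$, any
structure $\mathcal{A}$, and any $n$-tuple $\bar a$ of elements in $A$,


$\varphi^{\mathcal{A},\bar a} = T$ if $\mathcal{A} \models \varphi[\bar a]$, and $\varphi^{\mathcal{A},\bar a} = F$ otherwise (19.5)


Then (19.4) can be rewritten as

$\mathcal{A} \models (\varphi \,\&\, \psi)[\bar a]$ iff $\&(\varphi^{\mathcal{A},\bar a}, \psi^{\mathcal{A},\bar a}) = T$


(or even more compactly as $(\varphi \,\&\, \psi)^{\mathcal{A},\bar a} = \&(\varphi^{\mathcal{A},\bar a}, \psi^{\mathcal{A},\bar a})$), where the last ‘&’ denotes
the truth function given by the usual truth table for conjunction.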

To do something similar for $\forall$ and $\exists$, extend the notation in (19.5) as follows.
Let $\mathcal{A}$ be a $\sigma$-structure, $\varphi = \varphi(x, x_1, \ldots, x_n)$ a $\sigma$-formula with at most the free
variables shown, and $(a_1, \ldots, a_n) = \bar a$ an $n$-tuple of elements of $A$. Then

$\varphi^{\mathcal{A},x,\bar a} = \{a \in A : \mathcal{A} \models \varphi[a, \bar a]\}$

In a structure $\mathcal{A}$, a formula with one free variable denotes a set: the set of objects in
$A$ satisfying the formula. If $\varphi$ has additional free variables $x_1, \ldots, x_n$, but these are
interpreted as $a_1, \ldots, a_n$, respectively, then, relative to this interpretation, $\varphi$ still
denotes a set, and this set is $\varphi^{\mathcal{A},x,\bar a}$. Now (19.2) and (19.3) may be rewritten as

$\mathcal{A} \models (\forall x)\varphi[x, \bar a]$ iff $\varphi^{\mathcal{A},x,\bar a} = A$ (19.6)

$\mathcal{A} \models (\exists x)\varphi[x, \bar a]$ iff $\varphi^{\mathcal{A},x,\bar a} \ne \emptyset$ (19.7)


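Just one small further step is needed. Let, on each universe $A$, $\exists_A$ be the set of
non-empty subsets of $A$. And let $\forall_A$ be simply $\{A\}$. Then (19.6) and (19.7) become


$\mathcal{A} \models (\forall x)\varphi[x, \bar a]$ iff $\varphi^{\mathcal{A},x,\bar a} \in \forall_A$ (19.8)

$\mathcal{A} \models (\exists x)\varphi[x, \bar a]$ iff $\varphi^{\mathcal{A},x,\bar a} \in \exists_A$ (19.9)

That is, $\forall$ and $\exists$ may be thought of as denoting, on a universe $A$, a set of subsets
of $A$. But then, any such set of subsets can be called a (generalized) quantifier
on $A$.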

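For example, suppose one wants a quantifier that says ‘there exist at least $n$
objects such that.’ Introduce a symbol ‘$\exists_{\ge n}$’ and define, for each universe $A$,

$(\exists_{\ge n})_A = \{X \subseteq A : |X| \ge n\}$

($|X|$ is the cardinality of $X$). Then the clause

$\mathcal{A} \models (\exists_{\ge n} x)\varphi[x, \bar a]$ iff $\varphi^{\mathcal{A},x,\bar a} \in (\exists_{\ge n})_A$ iff $|\varphi^{\mathcal{A},x,\bar a}| \ge n$ (19.10)

gives just what is wanted.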

The pattern is clear, and completely general. That is, a quantifier $Q$ on $A$ is a
set of subsets of $A$. ‘$Q$’ can also be thought of as a new symbol, such that whenever
$\varphi$ is a formula, so is

$(Qx)\varphi$


($(Qx)$ binds free occurrences of $x$ in $\varphi$ just as usual), and its meaning is given by the
clause


$\mathcal{A} \models (Qx)\varphi[x, \bar a]$ iff $\varphi^{\mathcal{A},x,\bar a} \in Q_A$

Here are some more examples:


$(\exists_{=n})_A = \{X \subseteq A : |X| = n\}$ (19.11)
(‘there are exactly $n$ objects such that’)

$(Q_0)_A = \{X \subseteq A : X \text{ is infinite}\}$ (19.12)
(‘there are infinitely many objects such that’; the name ‘$Q_0$’ is standard and is due
to the fact that the quantifier means ‘at least $\aleph_0$’)


$(Q^C)_A = \{X \subseteq A : |X| = |A|\}$ (19.13)
(the ‘Chang quantifier’; it means $\forall$ on finite sets but not on infinite sets)


$(Q^R)_A = \{X \subseteq A : |X| > |A - X|\}$ (19.14)
(the ‘Rescher quantifier’; on finite sets it means ‘for more than half the elements
of the universe’)


To see the use of such quantifiers, here is a prime example of how one can express
new things with them. Consider again the structure $\mathcal{N}$ from section 19.2. It is a
fact about this structure that every element has a finite number of predecessors.
There is no way to express this in FO; a proof of this will be given later. But using
$Q_0$, the sentence


$(\forall x)\neg(Q_0 y)(y < x)$


says exactly this.
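The definitions in this section can be tried out on finite universes. The following is only a sketch under stated assumptions: a quantifier on A is represented by its membership test `Q(A, X)` rather than by an explicit set of subsets, and the names `holds`, `at_least`, and `rescher` are illustrative choices, not notation from the text. Infinite cases such as Q_0 are out of reach of this encoding.

```python
# A type (1) quantifier on A is a set of subsets of A; here it is
# represented by the test "is X a member of Q_A?".
def forall(A, X):
    return X == A                      # X belongs to {A}

def exists(A, X):
    return len(X) > 0                  # X is a non-empty subset of A

def at_least(n):
    return lambda A, X: len(X) >= n    # (19.10)

def rescher(A, X):
    return len(X) > len(A - X)         # (19.14)

def holds(Q, A, phi):
    """A |= (Qx)phi  iff  {a in A : phi(a)} belongs to Q_A."""
    return Q(A, frozenset(a for a in A if phi(a)))

A = frozenset(range(6))
is_even = lambda a: a % 2 == 0         # its extension in A is {0, 2, 4}

print(holds(exists, A, is_even), holds(forall, A, is_even))
```

Here `holds` plays the role of the clause for (Qx)φ with no parameters: it first computes the set denoted by the formula and then asks whether that set is in the quantifier.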


19.4. Monadic Quantifiers


Having considered quantifiers which are sets of subsets on a universe $A$, it is natural
to go further and consider relations between subsets of $A$. It is here that one finds,
to begin, the four Aristotelian quantifiers:

$\mathit{all}_A\,XY$ iff $X \subseteq Y$ (i.e., ‘all X are Y’, where $X, Y \subseteq A$)

$\mathit{some}_A\,XY$ iff $X \cap Y \ne \emptyset$

$\mathit{no}_A\,XY$ iff $X \cap Y = \emptyset$

$\mathit{not\ all}_A\,XY$ iff $X \not\subseteq Y$


But there are many more binary relations between subsets of $A$, for example:


$I_A\,XY$ iff $|X| = |Y|$ (19.15)
(the Härtig quantifier)

$\mathit{more}_A\,XY$ iff $|X| > |Y|$ (19.16)

$\mathit{most}_A\,XY$ iff $|X \cap Y| > |X - Y|$ (19.17)
(on finite universes this means ‘more than half of the X are Y’)

$\mathit{at\ least\ m/n}_A\,XY$ iff $|X \cap Y| \ge m/n \cdot |X|$ (19.18)
($0 < m < n$; the properly proportional quantifiers; they only make sense if $X$ is
finite).


These are just examples: if $A$ has $n$ elements, there are $2^n$ subsets of $A$, and $2^{4^n}$
binary relations between subsets of $A$. So over a universe with just two elements,
there are $2^{16} = 65536$ such relations!
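The count can be checked by brute force. This is a small sketch, with `subsets` and `count_type_11_quantifiers` as illustrative helper names: it enumerates the subsets of an n-element universe, forms all pairs (X, Y), and counts the sets of such pairs.

```python
from itertools import combinations, product

def subsets(universe):
    """All subsets of a finite universe, as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def count_type_11_quantifiers(n):
    """Number of binary relations between subsets of an n-element universe."""
    subs = subsets(range(n))                 # 2**n subsets
    pairs = list(product(subs, repeat=2))    # (2**n)**2 = 4**n pairs (X, Y)
    return 2 ** len(pairs)                   # every set of pairs is a relation

print(count_type_11_quantifiers(2))
```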

The quantifiers from section 19.3 are called of type (1), and those considered so
far in this section are called of type (1, 1). One can go on to consider quantifiers of
type (1, 1, 1), i.e., ternary relations between subsets of the universe, for example,


$\mathit{more\ than}_A\,XYZ$ iff $|X \cap Z| > |Y \cap Z|$ (19.19)
(‘more Xs than Ys are Z’)


In general, a monadic quantifier of type (1, ..., 1) on $A$ (with $k$ 1s) is a $k$-ary
relation between subsets of $A$, for some $k \ge 1$. This terminology indicates that there
are also polyadic quantifiers, for example of type (2, 1, 3), but these are left until
section 19.11.

Finally, note that the meaning of a quantifier like some or most is not dependent
on a particular universe; rather it associates with each universe a corresponding
quantifier on that universe. So the general definition is as follows:


A (monadic) quantifier of type (1, ..., 1) (with $k$ 1s) is a function $Q$ which
associates with each universe $A$ a quantifier $Q_A$ of type (1, ..., 1) on $A$, in other
words, a $k$-ary relation between subsets of $A$.


Such a quantifier $Q$ can also be considered as a variable-binding operator, but now
it operates on $k$ formulas and binds one variable in each. That is:


(Q-syn) If $\varphi_1, \ldots, \varphi_k$ are formulas, then $(Qx)(\varphi_1, \ldots, \varphi_k)$ is a formula (where
all free occurrences of $x$ in each $\varphi_i$ are bound by $(Qx)$),


whose meaning is given by the clause


(Q-sem) $\mathcal{A} \models (Qx)(\varphi_1, \ldots, \varphi_k)[\bar a]$ iff $(\varphi_1^{\mathcal{A},x,\bar a}, \ldots, \varphi_k^{\mathcal{A},x,\bar a}) \in Q_A$
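On finite universes the type (1, 1) examples and the clause (Q-sem) can be sketched directly. The encoding below is an assumption made for illustration: a quantifier is a function of the universe and the argument sets, and `extension` and `holds` are hypothetical helper names.

```python
def all_(A, X, Y):    return X <= Y
def some(A, X, Y):    return len(X & Y) > 0
def no(A, X, Y):      return len(X & Y) == 0
def not_all(A, X, Y): return not X <= Y
def more(A, X, Y):    return len(X) > len(Y)              # (19.16)
def most(A, X, Y):    return len(X & Y) > len(X - Y)      # (19.17)

def extension(A, phi):
    """The set denoted in A by a formula with one free variable."""
    return frozenset(a for a in A if phi(a))

def holds(Q, A, *phis):
    """(Q-sem) without parameters: apply Q_A to the extensions of the formulas."""
    return Q(A, *(extension(A, phi) for phi in phis))

A = frozenset(range(10))
small = lambda a: a < 4          # extension {0, 1, 2, 3}
even = lambda a: a % 2 == 0      # extension {0, 2, 4, 6, 8}

print(holds(some, A, small, even), holds(most, A, small, even))
```

Because `holds` accepts any number of formulas, the same function covers quantifiers of type (1), (1, 1), (1, 1, 1), and so on.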





19.5. Quantifiers and Quantities 


The quantifiers considered so far have an important feature: they deal only with
quantities. By contrast, here is an example of a type (1) quantifier that does not deal
with quantities. Let John be an individual and define


$(Q_{John})_A X$ iff John $\in X$

That is, if John $\in A$ then $(Q_{John})_A$ consists of all those subsets of $A$ containing John;
otherwise $(Q_{John})_A$ is empty. This is not an unreasonable object (when John $\in A$).
In mathematics, it is called the principal filter (over $A$) generated by John. In
linguistics, it has been used to interpret the proper name John. But clearly, it says
nothing about quantities.

To explain this, the concept of isomorphism between structures is needed.
Intuitively, isomorphic structures ‘have the same structure’ and can for many purposes be
identified. Let $\mathcal{A}$ and $\mathcal{B}$ be structures for the same signature $\sigma$, which, for simplicity,
can be taken to be relational. An isomorphism between $\mathcal{A}$ and $\mathcal{B}$ is a bijection $f$ from
the universe $A$ to the universe $B$ (a one-one mapping from one onto the other) such
that if $P$ is an $n$-ary relation symbol in $\sigma$ and $a_1, \ldots, a_n \in A$, then


$(a_1, \ldots, a_n) \in P^{\mathcal{A}} \Leftrightarrow (f(a_1), \ldots, f(a_n)) \in P^{\mathcal{B}}$

This is written as $f : \mathcal{A} \cong \mathcal{B}$, and $\mathcal{A} \cong \mathcal{B}$ says that $\mathcal{A}$ and $\mathcal{B}$ are isomorphic,
i.e., there is an isomorphism between $\mathcal{A}$ and $\mathcal{B}$.

First-order logic cannot distinguish between isomorphic structures:


(Isomorphism closure) If $\mathcal{A} \cong \mathcal{B}$ then every FO sentence which is true in $\mathcal{A}$ is true
in $\mathcal{B}$, and vice versa.


(The converse of this is far from true in general, though it does hold for finite
structures.) In fact, isomorphism closure is usually a requirement on any logic, as
shall be seen.

For the moment, however, I want to bring out the connection between isomorphism
and quantity. First, note that if $\mathcal{A} \cong \mathcal{B}$ then $|A| = |B|$, since the latter means
by definition that there is a bijection between $A$ and $B$. But in the special case of
monadic structures, i.e., structures with only unary relations, more can be said.
Consider a signature with two unary relation symbols. A structure $\mathcal{A} = (A, X, Y)$ for
this signature partitions the universe into four parts as shown in figure 19.1.

[Figure 19.1: the universe $A$ divided by $X$ and $Y$ into the four parts $X - Y$, $X \cap Y$, $Y - X$, and $A - (X \cup Y)$]

Now if

$(A, X, Y) \cong (A', X', Y')$

then the corresponding parts of the two structures must have the same cardinality.
For the restriction of the bijection $f$ to, say, $X - Y$ becomes a bijection between
$X - Y$ and $X' - Y'$, and similarly for the other parts. But the converse holds too: if
the corresponding parts have the same cardinality, then there are four bijections,
whose union is an isomorphism between the two structures. That is,


Fact 19.1 $(A, X, Y) \cong (A', X', Y')$ iff
$|X - Y| = |X' - Y'|$,
$|X \cap Y| = |X' \cap Y'|$,
$|Y - X| = |Y' - X'|$, and
$|A - (X \cup Y)| = |A' - (X' \cup Y')|$.


This generalizes to all monadic structures: if there are $k$ unary relation symbols in
the signature, the universe is partitioned into $2^k$ parts, and the number of elements
in these is, up to isomorphism, all there is to say about the structure.
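For finite structures, Fact 19.1 and the argument leading up to it can be sketched as follows. The function names are illustrative assumptions, and elements are assumed to be sortable so that the four bijections can be built by pairing sorted lists.

```python
def cell_sizes(A, X, Y):
    """Sizes of the four parts of (A, X, Y): X-Y, X&Y, Y-X, A-(X|Y)."""
    return (len(X - Y), len(X & Y), len(Y - X), len(A - (X | Y)))

def isomorphic(s1, s2):
    """Fact 19.1, for finite structures with two unary relations."""
    return cell_sizes(*s1) == cell_sizes(*s2)

def build_isomorphism(s1, s2):
    """Union of four bijections, one per pair of corresponding parts.
    Returns None when some pair of parts differs in size."""
    (A, X, Y), (B, U, V) = s1, s2
    parts1 = [X - Y, X & Y, Y - X, A - (X | Y)]
    parts2 = [U - V, U & V, V - U, B - (U | V)]
    f = {}
    for p, q in zip(parts1, parts2):
        if len(p) != len(q):
            return None
        f.update(zip(sorted(p), sorted(q)))
    return f

s1 = (frozenset(range(5)), frozenset({0, 1}), frozenset({1, 2}))
s2 = (frozenset("abcde"), frozenset("cd"), frozenset("de"))
f = build_isomorphism(s1, s2)
print(isomorphic(s1, s2), f)
```

The map `f` sends X onto the first relation of the second structure and Y onto the second, which is what the isomorphism condition requires for unary relation symbols.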

Now consider the following property of a type (1, 1) quantifier $Q$:


(ISOM) If $(A, X, Y) \cong (A', X', Y')$ then $[Q_A XY \Leftrightarrow Q_{A'} X'Y']$


This is what I mean by saying that $Q$ deals only with quantities. If $Q$ satisfies ISOM
then, by fact 19.1, only the number of elements in $X - Y$, $X \cap Y$, $Y - X$, and
$A - (X \cup Y)$ determines whether $Q_A XY$ holds or not. Now look at our examples
of type (1, 1) quantifiers from section 19.4: each one is given by a condition on one
or more of these quantities; hence they all satisfy ISOM. For example,


$\mathit{all}_A\,XY \Leftrightarrow |X - Y| = 0$

$\mathit{some}_A\,XY \Leftrightarrow |X \cap Y| > 0$

$\mathit{most}_A\,XY \Leftrightarrow |X \cap Y| > |X - Y|$

$\mathit{more}_A\,XY \Leftrightarrow |X - Y| + |X \cap Y| > |Y - X| + |X \cap Y|$

etc.

ISOM is expressed similarly for other monadic types. In the type (1) case, it
says that whether $Q_A X$ holds or not is determined by the two quantities $|X|$ and
$|A - X|$. Thus, all the type (1) quantifiers from section 19.3 satisfy ISOM, but the
quantifier $Q_{John}$ does not: one may have $X, X' \subseteq A$ with


$|X| = |X'|$


$|A - X| = |A - X'|$

but John $\in X - X'$.
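The contrast can be made concrete on a small universe. The check below is only a necessary condition for ISOM, under the assumption that one fixed finite universe is inspected: on that universe, membership of X in Q_A may depend only on |X| (and hence on |A - X|). The names `satisfies_isom_on` and `q_john`, and the three-element universe, are illustrative.

```python
from itertools import combinations

def subsets(universe):
    """All subsets of a finite universe, as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def satisfies_isom_on(Q, A):
    """Type (1) ISOM restricted to the single universe A: sets of the same
    size must receive the same verdict from Q_A."""
    verdict_by_size = {}
    for X in subsets(A):
        verdict = Q(A, X)
        if verdict_by_size.setdefault(len(X), verdict) != verdict:
            return False
    return True

def q_john(A, X):
    return "john" in X

def rescher(A, X):
    return len(X) > len(A - X)

A = frozenset({"john", "mary", "sue"})
print(satisfies_isom_on(rescher, A), satisfies_isom_on(q_john, A))
```

The failure for `q_john` is witnessed by the sets {john} and {mary}: they have the same size and equally large complements, but only one of them contains John.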
It should come as no surprise that there is a tight connection between ISOM and
isomorphism closure. To state it, we must sharpen our idea of what a logic is.


19.6. Logics with Generalized Quantifiers


In chapter 1, first-order logic FO is characterized as a collection of (artificial)
languages: for each signature $\sigma$ there is the set of $\sigma$-formulas, defined inductively,
starting with the atomic formulas, and then one clause for each logical constant.
Then, the relation $\models$ between a structure, a $\sigma$-formula, and a valuation (of the
variables in the universe of the structure) is defined with a corresponding induction,
with (19.2) and (19.3) (or (19.8) and (19.9)) as the inductive clauses corresponding
to $\forall$ and $\exists$.

Now let $Q$ be any (for the time being monadic) quantifier. The logic FO($Q$) is
given, syntactically, by adding (Q-syn) (cf. section 19.4) as a defining clause of the
$\sigma$-formulas, and, semantically, (Q-sem) as a defining clause for $\models$. Thus, FO($Q$) has
all the expressive machinery of first-order logic, plus the quantifier $Q$.

Similarly, we can define FO($Q_1, \ldots, Q_n$), or even FO($\mathbf{Q}$) where $\mathbf{Q}$ is any set of
quantifiers. By a logic I will mean a logic of this form (there are more general
notions of a logic but they will not concern us here).

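For example, FO($Q_0$) is a logic, with atomic formulas, negations, conjunctions,
existential and universal quantifications as usual, and formulas of the form


$(Q_0 x)\varphi$

whose meaning is given by

$\mathcal{A} \models (Q_0 x)\varphi[x, \bar a] \Leftrightarrow \varphi^{\mathcal{A},x,\bar a}$ is infinite

FO($Q_0, \exists_{=17}$) is another logic, where in addition there are formulas of the form

$(\exists_{=17} x)\varphi$

where

$\mathcal{A} \models (\exists_{=17} x)\varphi[x, \bar a] \Leftrightarrow |\varphi^{\mathcal{A},x,\bar a}| = 17$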


The notion of isomorphism closure makes sense for any logic, and it is now easy
to establish


Fact 19.2 If each quantifier in $\mathbf{Q}$ satisfies ISOM, then isomorphism closure holds
for FO($\mathbf{Q}$).


(To show this, one proves by induction over formulas something a little more
general, namely, that if $f : \mathcal{A} \cong \mathcal{B}$ and $a_1, \ldots, a_n \in A$ then

$\mathcal{A} \models \varphi[a_1, \ldots, a_n] \Leftrightarrow \mathcal{B} \models \varphi[f(a_1), \ldots, f(a_n)]$

for all $\varphi$ in FO($\mathbf{Q}$) of the relevant signature. Fact 19.2 is the special case of this
when $\varphi$ is a sentence.)

One reason that ISOM is important is thus that Fact 19.2 needs to hold, or, put
differently, quantifiers need to be logical constants. There has been some discussion
as to just what logicality means, but it is generally agreed that isomorphism closure
is at least a necessary condition: logic should be indifferent to which universe
of objects one is talking about. It is “topic-neutral,” it cares only about structure
[see chapter 6]. In the case of monadic quantifiers, there is a further reason, as has
been seen: these particular logical constants care only about quantities of things, not
the things in themselves. Hence the adequacy of the term quantifier.

Are logics with generalized quantifiers first-order or not? There is a sense in which
they are: they quantify only over individuals of the universe, not, as in second-order
logic, over sets of such individuals [see chapter 2]. Thus, the notion of a signature,
and the notion of a structure, are the same for these logics as for FO. However, the
term first-order logic has become synonymous with FO, and, in this sense, many of
the logics introduced here are not first-order, since their expressive power exceeds
that of FO.





19.7. Expressive Power 


Consider again the logic FO($Q_0$). Clearly this logic, and any logic of the form FO($\mathbf{Q}$),
extends FO: everything that can be said in FO can also be said in them. Moreover,
it is also clear that FO($Q_0$) is more expressive than FO; for example, as mentioned in
section 19.3 one can say in FO($Q_0$) that every element of $\mathcal{N}$ has a finite number
of predecessors, but one cannot say the same thing in FO. Or, to take a simpler
example, one can say in FO($Q_0$), but not in FO, that the universe is finite:


$\neg(Q_0 x)(x = x)$

To prove this, suppose there were an FO-sentence $\theta$ equivalent to $\neg(Q_0 x)(x = x)$.
In FO one can write down, for every natural number $n$, a sentence $\varphi_n$ saying that the
universe has at least $n$ elements. Consider the theory (set of sentences)


$T = \{\theta\} \cup \{\varphi_n : n = 1, 2, \ldots\}$

Now $T$ has the property that every one of its finite subsets has a model (why?). By
the Compactness theorem which holds for FO [see chapter 1, section 1.10], it then
follows that $T$ has a model. But that is impossible: the universe of that model would
be finite, yet have at least $n$ elements for every $n$. Hence, $\neg(Q_0 x)(x = x)$ is not
equivalent to any FO-sentence. It also follows that the Compactness theorem does
not hold for FO($Q_0$).
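The ‘(why?)’ above has a short answer that can be sketched in code: a finite subset of T mentions only finitely many of the sentences saying ‘at least n elements’, so any finite universe whose size is the largest such n satisfies all of them, and it also satisfies the finiteness sentence. The text rendering of the sentences (with `E` for the existential quantifier and `!=` for non-identity) and both function names are ad hoc assumptions made for this illustration.

```python
def phi(n):
    """An FO sentence, as plain text, saying the universe has at least n elements."""
    if n <= 1:
        return "(Ex1)(x1 = x1)"
    xs = [f"x{i}" for i in range(1, n + 1)]
    prefix = "".join(f"(E{x})" for x in xs)
    distinct = " & ".join(f"{a} != {b}"
                          for i, a in enumerate(xs) for b in xs[i + 1:])
    return f"{prefix}({distinct})"

def model_size_for(indices):
    """Size of a finite universe satisfying the finiteness sentence together
    with phi(n) for every n in `indices` (a finite collection)."""
    return max(indices, default=1)

print(phi(3))
print(model_size_for([2, 7, 4]))
```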

We take these intuitions as the way to compare the expressive power of logics.
By definition, a logic $L'$ extends a logic $L$, in symbols, $L \le L'$, if each $L$-sentence is
equivalent to (has the same models as) some $L'$-sentence (of the same signature).
Thus, every logic of the kind considered here extends FO. Moreover, $L'$ properly
extends $L$, $L < L'$, if $L \le L'$ and $L' \not\le L$. The latter condition means that there is
some $L'$-sentence which is not equivalent to any $L$-sentence. For example,


FO < FO($Q_0$) (19.20)


as just seen. Finally, $L$ and $L'$ are said to be equivalent, $L \equiv L'$, if $L \le L'$ and $L' \le L$.

Note that equivalence between logics means same expressive power; it does not
mean identity. Consider FO and FO($\exists_{=17}$). These logics are equivalent: in FO it can
be said, for example, that a set has exactly 18 elements:


$(\exists x_1) \cdots (\exists x_{18})\,(\bigwedge_{1 \le i \le 18} P(x_i) \,\&\, \bigwedge_{1 \le i < j \le 18} x_i \ne x_j \,\&\, (\forall y)(P(y) \to \bigvee_{1 \le i \le 18} y = x_i))$


As can be seen, it takes 19 variables to say this. But in FO($\exists_{=17}$) it is possible to say
the same thing with just two variables:


$(\exists y)(P(y) \,\&\, (\exists_{=17} x)(P(x) \,\&\, x \ne y))$


Indeed, everything that can be said in FO($\exists_{=17}$) can also be said in FO, only it
sometimes takes more variables. The number of variables used is important for
certain applications of logic, but not for expressive power as defined here.


19.8. Definability 


Showing that $L \le L'$ may seem like a substantial task: for each one of the infinitely
many $L$-sentences, an equivalent $L'$-sentence must be found. But when $L$ is of the
form FO($\mathbf{Q}$), the task is usually much simpler: it suffices to show that each quantifier
in $\mathbf{Q}$ is definable in $L'$. For example, once it is seen that the quantifier $\exists_{=17}$ is
definable in FO, it is rather clear that any FO($\exists_{=17}$)-sentence can be rewritten as an
FO-sentence. And that $\exists_{=17}$ is definable in FO just means that the single sentence
$(\exists_{=17} x)P(x)$ is equivalent to some FO-sentence, as of course it is.

To be precise: Suppose $Q$ is a type (1, 1) quantifier. $Q$ is said to be definable in a
logic $L$ if the sentence


$(Qx)(P_1(x), P_2(x))$

is equivalent to some $L$-sentence of the same signature (in this case the signature
$\{P_1, P_2\}$ consisting of two unary relation symbols). Similarly for quantifiers of other
types. Now it is not hard to show





Fact 19.3 FO($\mathbf{Q}$) $\le L$ iff each quantifier in $\mathbf{Q}$ is definable in $L$.


Look at some examples. It has been seen that FO $\equiv$ FO($\exists_{=17}$); hence










Dag Westerståhl
FO($Q_0$) $\equiv$ FO($Q_0, \exists_{=17}$) (19.21)

since $\exists_{=17}$, being definable in FO, is a fortiori definable in FO($Q_0$).
FO($Q_0$) $\le$ FO($I$) $\le$ FO($\mathit{more}$) (19.22)


The first part holds since a set is infinite iff it has the same cardinality as some proper
subset, so $(Q_0 x)P(x)$ is equivalent to


$(\exists x)(P(x) \,\&\, (Iy)(P(y), P(y) \,\&\, y \ne x))$

The second part holds because $(Ix)(P_1(x), P_2(x))$ is clearly equivalent to

$\neg(\mathit{more}\ x)(P_1(x), P_2(x)) \,\&\, \neg(\mathit{more}\ x)(P_2(x), P_1(x))$


One can show that both of the inequalities in (19.22) are in fact strict. (These are
examples of undefinability results; more about that in section 19.9.)


FO($\mathit{most}$) $\le$ FO($\mathit{more}$) (19.23)

since $(\mathit{most}\ x)(P_1(x), P_2(x))$ is equivalent to

$(\mathit{more}\ x)(P_1(x) \,\&\, P_2(x), P_1(x) \,\&\, \neg P_2(x))$

Again, this is a strict inequality in general. But note that if $X \cap Y$ is a finite set, then

$|X| > |Y| \Leftrightarrow |X - Y| + |X \cap Y| > |Y - X| + |X \cap Y|$
$\Leftrightarrow |X - Y| > |Y - X|$

So when (the interpretation of) $P_1 \cap P_2$ is finite, $(\mathit{more}\ x)(P_1(x), P_2(x))$ is equivalent to

$(\mathit{most}\ x)((P_1(x) \,\&\, \neg P_2(x)) \lor (P_2(x) \,\&\, \neg P_1(x)), P_1(x) \,\&\, \neg P_2(x))$

Let this last sentence be $\psi_1$. Next, when $X \cap Y$ is an infinite set, then $|X|$ is the
maximum of $|X - Y|$ and $|X \cap Y|$, and likewise $|Y|$ is the maximum of $|Y - X|$ and
$|X \cap Y|$. (These are facts of cardinal arithmetic.) It follows that, in this case,

$|X| > |Y| \Leftrightarrow |X - Y| > |Y - X|$ and $|X - Y| > |X \cap Y|$

That is, when $P_1 \cap P_2$ is infinite, $(\mathit{more}\ x)(P_1(x), P_2(x))$ is equivalent to

$\psi_1 \,\&\, (\mathit{most}\ x)(P_1(x), \neg P_2(x))$


Let the second conjunct above be $\psi_2$. It now follows that, on any universe,
$(\mathit{more}\ x)(P_1(x), P_2(x))$ is equivalent to


$(\neg(Q_0 x)(P_1(x) \,\&\, P_2(x)) \,\&\, \psi_1) \lor ((Q_0 x)(P_1(x) \,\&\, P_2(x)) \,\&\, \psi_1 \,\&\, \psi_2)$

Putting all of the above together shows that

FO($\mathit{more}$) $\equiv$ FO($Q_0, \mathit{most}$) (19.24)
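The finite half of this argument can be verified exhaustively on a small universe. The sketch below assumes a five-element universe and illustrative function names; it checks that most is expressed by the sentence with more used for (19.23), and that more agrees with the first defining sentence (the one built from most applied to the symmetric difference) whenever everything is finite. The infinite case, which needs Q_0 and the second conjunct, is outside what a finite check can show.

```python
from itertools import combinations

def subsets(universe):
    """All subsets of a finite universe, as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def most(X, Y):
    return len(X & Y) > len(X - Y)

def more(X, Y):
    return len(X) > len(Y)

def most_via_more(X, Y):
    """(more x)(P1 & P2, P1 & ~P2), with X, Y interpreting P1, P2."""
    return more(X & Y, X - Y)

def psi1(X, Y):
    """(most x)((P1 & ~P2) v (P2 & ~P1), P1 & ~P2)."""
    return most((X - Y) | (Y - X), X - Y)

A = frozenset(range(5))
pairs = [(X, Y) for X in subsets(A) for Y in subsets(A)]
print(all(most(X, Y) == most_via_more(X, Y) for X, Y in pairs))
print(all(more(X, Y) == psi1(X, Y) for X, Y in pairs))
```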
All type (1) quantifiers are definable in terms of type (1, 1) quantifiers (but not
vice versa); in fact, there is a uniform way of strengthening a type (1) quantifier $Q$ to
its so-called relativization, which is the type (1, 1) quantifier $Q^{rel}$ defined by

(Rel) $Q^{rel}_A XY \Leftrightarrow Q_X (X \cap Y)$

Roughly, $Q^{rel}$ says (on any universe $A$) about $X$, $Y$ what $Q$ says on the universe $X$
about $X \cap Y$. In other words, the quantification domain is restricted to the first
argument of $Q^{rel}_A$. We have $Q_A X \Leftrightarrow Q^{rel}_A AX$, that is, the following is logically valid:

$(Qx)P(x) \leftrightarrow (Q^{rel} x)(x = x, P(x))$

Which means that

FO($Q$) $\le$ FO($Q^{rel}$) (19.25)

Here are some examples of relativizations:

$\forall^{rel} = \mathit{all}$
$\exists^{rel} = \mathit{some}$
$(\exists_{\ge n})^{rel} = \mathit{at\ least}\ n$
$(Q^R)^{rel} = \mathit{most}$ ($Q^R$ was defined in (19.14).)
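The clause (Rel) translates almost word for word into code on finite universes. In this sketch a type (1) quantifier is assumed to be a function `Q(A, X)`, and `relativize` is an illustrative name; the result is checked against direct definitions of all, some, and most.

```python
from itertools import combinations

def subsets(universe):
    """All subsets of a finite universe, as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def relativize(Q):
    """(Rel): Q^rel_A X Y  iff  Q_X (X & Y); the first argument becomes the universe."""
    return lambda A, X, Y: Q(X, X & Y)

def forall(A, X):  return X == A
def exists(A, X):  return len(X) > 0
def rescher(A, X): return len(X) > len(A - X)

all_rel, some_rel, most_rel = (relativize(Q) for Q in (forall, exists, rescher))

A = frozenset(range(4))
pairs = [(X, Y) for X in subsets(A) for Y in subsets(A)]
print(all(all_rel(A, X, Y) == (X <= Y) for X, Y in pairs))
print(all(most_rel(A, X, Y) == (len(X & Y) > len(X - Y)) for X, Y in pairs))
```

Taking the whole universe as first argument gives back the original quantifier, which is the finite counterpart of the logically valid equivalence stated above.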





So, note that the Aristotelian quantifiers are relativizations of familiar type (1)
quantifiers. In the first three cases above, the relativizations are, in turn, definable from
the unrelativized quantifiers, for example

$(\mathit{all}\ x)(P_1(x), P_2(x)) \leftrightarrow (\forall x)(P_1(x) \to P_2(x))$

$(\mathit{some}\ x)(P_1(x), P_2(x)) \leftrightarrow (\exists x)(P_1(x) \,\&\, P_2(x))$


In other words,


FO $\equiv$ FO($\mathit{all}$) $\equiv$ FO($\mathit{some}$) $\equiv$ FO($\exists_{\ge n}$) $\equiv$ FO($\mathit{at\ least}\ n$) (19.26)

However, interestingly,

FO($Q^R$) < FO($\mathit{most}$) (19.27)

Even on finite universes, in fact, saying that $X \cap Y$ has more than half the elements
of $X$ is not expressible in first-order logic plus the quantifier saying that a set has
more than half the elements of the whole universe.
19.9. Undefinability


To prove that a particular quantifier $Q$ is definable in some logic $L$, a definition must
be provided, i.e., a defining $L$-sentence. This can be more or less involved (as was
the case with more, most and $Q_0$ in section 19.8), but is often straightforward. To
prove that $Q$ is not so definable, however, is harder. Here one really needs to verify
that none of the infinitely many $L$-sentences works as a definition.

Sometimes one can manage by showing that $L$ has some property that it would
not have if $Q$ were definable. This is how it was shown that $Q_0$ is not definable in
FO, using the fact that FO has the compactness property. But this is more of an
exception; most logics lack compactness, or other similarly useful properties. There
are, however, more elementary and direct methods of showing undefinability, but a
description of these falls outside the scope of this chapter. A thorough survey of
(un)definability issues for logics with monadic quantifiers is given in Väänänen (1997).

Using these methods, it can be shown, for example, that the seemingly innocuous
quantifier $\mathit{most} = (Q^R)^{rel}$ is essentially type (1, 1) in a very strong sense: not only is it
not definable from $Q^R$, but:


Theorem 19.4 most is not definable in any logic of the form FO($Q_1, \ldots, Q_n$),
where the $Q_i$ are of type (1). (In fact, the same holds for all the properly
proportional quantifiers.) (Kolaitis and Väänänen, 1995)


19.10. Monotonicity


Among the multitude of possible quantifiers, the ones that actually turn up in
familiar logical or linguistic contexts often have characteristic properties. Logicians
want to know if logics with generalized quantifiers are well-behaved in various ways,
for example if the compactness property holds for them (see section 19.7), or if they
are complete, i.e., if their sets of logically valid sentences are recursively enumerable
(can be axiomatized by a formal system). Unfortunately, many logics fail to have
either of these properties; examples are FO($Q_0$) and FO($\mathit{most}$); proofs of these facts
can be found in Westerståhl (1989).

But one may more simply just look at the properties of the quantifiers themselves,
and then perhaps the most conspicuous ones are the monotonicity properties:





• A type (1) quantifier $Q$ is upward monotone, MON↑, if for all $A$,
$Q_A X$ and $X \subseteq Y \subseteq A$ implies $Q_A Y$


Downward monotonicity, MON↓, is defined correspondingly.

Similarly, for type (1, 1) quantifiers one can talk about upward or downward
monotonicity in the first or second argument, and use MON with up- or down-arrows
to the right and/or left to indicate this. For example, a type (1, 1) $Q$ is
↓MON if, for all $A$,

$Q_A XY$ and $X' \subseteq X$ implies $Q_A X'Y$


And it is, say, ↑MON↓ if it is upward monotone in the first argument and
downward monotone in the second argument.


Now, looking at our examples, notice that $\forall$, $\exists$, $\exists_{\ge n}$, $Q^R$, $Q_0$, $Q^C$ are all MON↑,
whereas, say, $\exists_{\le n}$ is MON↓. A typical quantifier which is neither upward nor downward
monotone is $\exists_{=n}$, but note that it is the conjunction of an upward and a
downward one:


$\exists_{=n} = \exists_{\ge n} \,\&\, \exists_{\le n}$


So monotonicity is ubiquitous. Here, however, is an example of a thoroughly
non-monotone type (1) quantifier:


$(Q_{even})_A X \Leftrightarrow |X|$ is even


As to our type (1, 1) quantifiers, some and at least n are ↑MON↑, no is ↓MON↓,
every is ↓MON↑, more is ↑MON↓, and most is MON↑ but, as the reader can easily
verify, not monotone (in either direction) in the first argument. $I$ is non-monotone,
but, as shown in section 19.8, it is definable with Boolean operations from the
monotone more. And again, a thoroughly non-monotone type (1, 1) quantifier is an
even number of = $(Q_{even})^{rel}$.
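The verification left to the reader can be done by brute force on a small universe. In the sketch below, `monotone` is a hypothetical helper that tests one argument (0 for the first, 1 for the second) in one direction; a passing test on a three-element universe is of course only evidence for monotonicity, while a failing test is a genuine counterexample.

```python
from itertools import combinations, product

def subsets(universe):
    """All subsets of a finite universe, as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def monotone(Q, A, arg, up):
    """Is the type (1, 1) quantifier Q monotone on A in argument `arg`,
    upward if `up` is true and downward otherwise?"""
    subs = subsets(A)
    for X, Y in product(subs, repeat=2):
        if not Q(A, X, Y):
            continue
        old = (X, Y)[arg]
        for Z in subs:
            if (old <= Z) if up else (Z <= old):
                new = (Z, Y) if arg == 0 else (X, Z)
                if not Q(A, *new):
                    return False
    return True

def some(A, X, Y):  return len(X & Y) > 0
def no(A, X, Y):    return len(X & Y) == 0
def every(A, X, Y): return X <= Y
def more(A, X, Y):  return len(X) > len(Y)
def most(A, X, Y):  return len(X & Y) > len(X - Y)
def even_number_of(A, X, Y): return len(X & Y) % 2 == 0

A = frozenset(range(3))
print(monotone(most, A, 1, True), monotone(most, A, 0, True), monotone(most, A, 0, False))
```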


19.11. Lindström Quantifiers


Monadic quantifiers are, on a given universe, relations among subsets of that
universe. But the business of mathematics is generalization, and it is then only natural
to consider quantifiers that are relations between relations over the universe. This
concept was introduced in Lindström (1966), and is the official notion of a
generalized quantifier in logic. Our earlier definitions easily carry over to this polyadic
case. This can be illustrated with an example.


A (generalized) quantifier of type (2, 1, 3) is a function $Q$ which associates with
each universe $A$ a quantifier $Q_A$ of type (2, 1, 3) on $A$, i.e., a ternary relation
between a binary relation over $A$, a subset of $A$, and a ternary relation over $A$.





Such a $Q$ can again be seen as a variable-binding operator, such that

(Q-syn) if $\varphi$, $\psi$, $\theta$ are formulas, then

$(Qxy, z, uvw)(\varphi, \psi, \theta)$


is a formula (where all free occurrences of $x$, $y$ in $\varphi$ are bound by the quantifier
prefix, and similarly for the other variables).

The meaning of this formula is given by the clause

(Q-sem) $\mathcal{A} \models (Qxy, z, uvw)(\varphi, \psi, \theta)[\bar a]$ iff $(\varphi^{\mathcal{A},xy,\bar a}, \psi^{\mathcal{A},z,\bar a}, \theta^{\mathcal{A},uvw,\bar a}) \in Q_A$

Here

$\varphi^{\mathcal{A},xy,\bar a} = \{(b, c) \in A^2 : \mathcal{A} \models \varphi[b, c, \bar a]\}$


etc. So the logic FO($Q$) is defined as before by adding these new clauses to the
definition of a formula and of satisfaction, respectively. The reader can easily
formulate all of this for the general case of a quantifier of type $(k_1, \ldots, k_n)$.

The property ISOM is defined for such a $Q$ in the same way as before (below for
the type (2, 1, 3) case, so $R \subseteq A^2$, $X \subseteq A$, $S \subseteq A^3$, etc.):


(ISOM) If $(A, R, X, S) \cong (A', R', X', S')$ then $[Q_A RXS \Leftrightarrow Q_{A'} R'X'S']$


Fact 19.2 generalizes too, so ISOM quantifiers earn the right to be called logical
constants. However, they no longer say anything about quantities, so the name
‘quantifier’ should be taken with a grain of salt in the polyadic case. Consider some
examples:


$D_A XR \Leftrightarrow R$ is a dense total ordering of $X$ (type (1, 2)) (19.28)
$W_A R \Leftrightarrow R$ is a well-ordering of the universe (type (2)) (19.29)


To express that $R$ is a dense total ordering of a set $X$ is easy in FO, so FO $\equiv$ FO($D$).
But the notion of a well-ordering is not expressible (as can be seen by a simple
application of the Compactness theorem): FO < FO($W$).

Let $Q$, $Q_1$, $Q_2$ be type (1) quantifiers. The next few examples illustrate so-called
lifts of monadic quantifiers to polyadic ones; in this version they lift type (1)
quantifiers to type (2) quantifiers.


$\mathit{Ram}(Q)_A R \Leftrightarrow \exists X \subseteq A\,(Q_A X \,\&\, \forall a, b \in X\,(a \ne b \Rightarrow R(a, b)))$ (19.30)
$\mathit{Br}(Q_1, Q_2)_A R \Leftrightarrow \exists X, Y \subseteq A\,((Q_1)_A X \,\&\, (Q_2)_A Y \,\&\, X \times Y \subseteq R)$ (19.31)
$\mathit{Res}(Q)_A R \Leftrightarrow Q_{A^2} R$ (19.32)


In all of these cases, it is assumed that the lifted type (1) quantifiers are MON↑.
(Equation (19.30) is related to the so-called Ramsey Theorem; cf. any textbook of
model theory.)
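On a finite universe the three lifts can be computed by searching over subsets. This is a sketch under the assumption that type (1) quantifiers are functions `Q(A, X)` and binary relations are sets of pairs; `ram`, `br`, and `res` are illustrative names mirroring (19.30) to (19.32).

```python
from itertools import combinations, product

def subsets(universe):
    """All subsets of a finite universe, as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def ram(Q):
    """(19.30): some X in Q_A on which R holds between all distinct elements."""
    return lambda A, R: any(
        Q(A, X) and all((a, b) in R for a in X for b in X if a != b)
        for X in subsets(A))

def br(Q1, Q2):
    """(19.31): some X in (Q1)_A and Y in (Q2)_A with X x Y included in R."""
    return lambda A, R: any(
        Q1(A, X) and Q2(A, Y) and all((a, b) in R for a in X for b in Y)
        for X in subsets(A) for Y in subsets(A))

def res(Q):
    """(19.32): Q applied on the universe of pairs A x A to the set R."""
    return lambda A, R: Q(frozenset(product(A, repeat=2)), frozenset(R))

def at_least(n):
    return lambda A, X: len(X) >= n

def rescher(A, X):
    return len(X) > len(A - X)

A = frozenset(range(3))
R = {(0, 1), (1, 0)}
print(ram(at_least(2))(A, R), br(at_least(2), at_least(2))(A, R), res(at_least(2))(A, R))
```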

The lift Br is an example of branching quantification. This idea originally stems
from Henkin (1961), who noted that the linear order of the quantifiers $\forall$ and $\exists$ in
FO imposes certain restrictions that can be avoided if a partial order is allowed
as well. This is, in fact, another way of generalizing FO quantification. Consider
the formula

$\begin{pmatrix} \forall x\,\exists y \\ \forall z\,\exists w \end{pmatrix} \varphi(x, y, z, w)$ (19.33)


This is read ‘for all $x$ there exists $y$ and for all $z$ there exists $w$ such that $\varphi(x, y, z, w)$’,
where the $y$ depends on $x$ but not on $z$, and the $w$ depends on $z$ but not on $x$. Such
dependencies cannot be expressed in FO. For example, in

$(\forall x)(\exists y)(\forall z)(\exists w)\varphi(x, y, z, w)$ (19.34)

$w$ depends on $x$ and $z$, and in

$(\forall x)(\forall z)(\exists y)(\exists w)\varphi(x, y, z, w)$ (19.35)


$y$ and $w$ both depend on $x$ and $z$. These dependencies appear clearly if (19.34) and
(19.35) are rewritten by means of so-called Skolem functions; then (19.34) becomes


$(\exists F)(\exists G)(\forall x)(\forall z)\varphi(x, F(x), z, G(x, z))$

and (19.35) is equivalent to

$(\exists F)(\exists G)(\forall x)(\forall z)\varphi(x, F(x, z), z, G(x, z))$

The formula (19.33), on the other hand, has the intended meaning

$(\exists F)(\exists G)(\forall x)(\forall z)\varphi(x, F(x), z, G(z))$


The quantifier prefix in (19.33) is called the Henkin quantifier. It can be subsumed
under the notion of generalized quantifier: define the type (4) quantifier $Q^H$ by


$Q^H_A R \Leftrightarrow$ there are functions $f$, $g$ on $A$ s.t. for all $a, b \in A$, $R(a, f(a), b, g(b))$


where $R \subseteq A^4$. Then (19.33) is equivalent to $(Q^H xyzw)\varphi(x, y, z, w)$. Other partially
ordered quantifier prefixes with $\forall$ and $\exists$ can be defined similarly. Adding the Henkin
quantifier already extends the expressive power of FO considerably. For example,
one may show that $Q_0$ and even more are definable in FO($Q^H$), so


FO < FO($\mathit{more}$) $\le$ FO($Q^H$)
The proof of this observation (due to Ehrenfeucht) is too simple and too pretty to
be left out here: we will express ‘there exists a one-one function $F$ from $P_1$ to $P_2$’;
this suffices since it means that $\neg(\mathit{more}\ x)(P_1(x), P_2(x))$. Consider the sentence


$\begin{pmatrix} \forall x\,\exists y \\ \forall z\,\exists u \end{pmatrix} ((x = z \leftrightarrow y = u) \,\&\, (P_1(x) \to P_2(y)))$ (19.36)

By definition, this means

$(\exists F)(\exists G)(\forall x)(\forall z)((x = z \leftrightarrow F(x) = G(z)) \,\&\, (P_1(x) \to P_2(F(x))))$


The (universally quantified) first conjunct gives, first (letting $z$ be $x$), that $F(x) = G(x)$
for all $x$, so $F = G$, and second, that if $x \ne z$ then $F(x) \ne F(z)$, so $F$ is one-one, and we
are done. QED
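On a very small finite universe, the Henkin quantifier can be evaluated by searching through all pairs of functions, which makes it possible to test sentence (19.36) directly. The sketch assumes a three-element universe and illustrative names (`henkin`, `at_most_as_many`); the search is exponential in the size of the universe, so it is meant only as a sanity check of the finite case.

```python
from itertools import combinations, product

def subsets(universe):
    """All subsets of a finite universe, as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def henkin(A, R):
    """Q^H_A R: there are f, g from A to A with R(a, f(a), b, g(b)) for all a, b."""
    elems = sorted(A)
    functions = [dict(zip(elems, values))
                 for values in product(elems, repeat=len(elems))]
    return any(all(R(a, f[a], b, g[b]) for a in elems for b in elems)
               for f in functions for g in functions)

def at_most_as_many(A, P1, P2):
    """Sentence (19.36), with the 4-ary relation given as a Python predicate."""
    def R(x, y, z, u):
        return ((x == z) == (y == u)) and (x not in P1 or y in P2)
    return henkin(A, R)

A = frozenset(range(3))
print(at_most_as_many(A, frozenset({0}), frozenset({1, 2})))
print(at_most_as_many(A, frozenset({0, 1}), frozenset({2})))
```

On this universe the sentence holds exactly when the first set has at most as many elements as the second, in agreement with the reading ‘not more’.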
Now Barwise (1979) suggested that one may also consider branching of (certain)
other quantifiers than $\forall$ and $\exists$, and the polyadic $\mathit{Br}(Q_1, Q_2)$ is an example of this,
which one could emphasize by writing the formula $(\mathit{Br}(Q_1, Q_2)xy)\varphi(x, y)$ as

$\begin{pmatrix} Q_1 x \\ Q_2 y \end{pmatrix} \varphi(x, y)$

It thus says that there is a set $X$ satisfying $Q_1$ and a set $Y$ satisfying $Q_2$ such that any
pair $(x, y)$ with $x \in X$ and $y \in Y$ satisfies $\varphi(x, y)$. The ‘order-independence’ of the
lifted quantifiers here is witnessed by the fact that the formula is equivalent to


$\begin{pmatrix} Q_2 y \\ Q_1 x \end{pmatrix} \varphi(x, y)$


So, in fact, there are two ways of generalizing FO quantification: one is through
the concept of a (Lindström) generalized quantifier (which, as noted, essentially
occurs already with Frege), and the other is through relaxing the linear left-right
order of FO logic. As seen, the latter can, for the case of $\forall$ and $\exists$, be subsumed
under the former. But there also arises the question as to whether one can ‘branch’
arbitrary generalized quantifiers. Barwise considered some cases of branching of
MON↑ quantifiers, but he explicitly stated that another definition is required for the
branching of MON↓ quantifiers, and he also claimed that the branching of, say, a
MON↑ and a MON↓ quantifier “makes no sense.” In spite of this, others have tried
to express the meaning of arbitrary partially ordered prefixes of arbitrary generalized
quantifiers (Sher, 1997). It remains to be seen, in my opinion, whether these ideas
yield a fruitful notion of (generalized) quantifier.

The lift $\mathit{Res}(Q)$, finally, is called the resumption (sometimes vectorization) of $Q$.
Looking at a binary relation $R$ as a set of ordered pairs, $\mathit{Res}(Q)_A R$ simply says about
$R$ what $Q$ says about that set of pairs. For example,


$\mathit{Res}(\exists_{\ge n})_A(R) \Leftrightarrow |R| \ge n$

i.e., $\mathit{Res}(\exists_{\ge n})(R)$ says that $R$ has at least $n$ pairs. Likewise,

$\mathit{Res}(Q^R)_A(R) \Leftrightarrow |R| > |A^2 - R|$


As one would expect, polyadic quantifiers have in general more expressive power than
monadic ones. As to the lifts, one can, for example, show that $(\mathit{Br}(Q_1, Q_2)xy)P(x, y)$
is usually stronger than the ‘linear versions’ $(Q_1 x)(Q_2 y)P(x, y)$ and $(Q_2 y)(Q_1 x)P(x, y)$.
Indeed, Hella et al. (1997) prove that these lifts are essentially polyadic:


Theorem 19.5 $\mathit{Br}(Q^R, Q^R)$ is not definable in any logic of the form
FO($Q_1, \ldots, Q_n$) where the $Q_i$ are monadic, and the same holds for $\mathit{Ram}(Q^R)$.


Undefinability results for polyadic quantifiers can be very hard to prove. An example
is the result in Luosto (2000) that $\mathit{Res}(Q^R)$ too is not definable from any finite
number of monadic quantifiers added to FO; this proof requires quite advanced
combinatorics.


19.12. Quantifiers and Natural Language 


The most obvious connection between (generalized) quantifiers and natural
languages is that many of these languages have a fundamental sentence construction of
the form [[[Q]_Det [X]_N]_NP [Y]_VP]_S or, in diagrammatic form,

            S
           / \
         NP   VP
        /  \    \
      Det   N    \
       |    |     |
       Q    X     Y        (19.37)


That is, (declarative) sentences are often formed by a noun phrase (NP) and a verb
phrase (VP), where the noun phrase consists of a determiner (Det) and a noun (N).²
Both nouns (‘man’, ‘teacher’, ‘hungry dog’, ‘student who likes a teacher’, ...) and
verb phrases (‘runs’, ‘smokes’, ‘likes Henry’, ‘gave a flower to some shop owner’, ...)
are naturally interpreted as sets, i.e., as subsets of the universe of discourse. Therefore,
the determiner (‘every’, ‘no’, ‘most’, ‘at least three’, ‘several of John’s ten’) can
be taken as a relation between sets, i.e., as a type (1, 1) quantifier (on the universe,
but the Det gives a quantifier on each universe, so it corresponds to a generalized
quantifier in our sense). For example,





No student likes Henry. (19.38) 
All but three teachers smoke. (19.39) 
Most hungry dogs are friendly. (19.40) 


Two thirds of John’s friends are teachers. (19.41)
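
As a rough illustration of determiners as relations between sets, sentences (19.38)-(19.40) can be evaluated in a toy model. Everything in the model (the individuals and the extensions of the nouns and verb phrases) is a made-up assumption, and the universe argument A is passed only to mirror the notation Q_A XY.

```python
def every(A, X, Y):
    return X <= Y                      # X is a subset of Y

def no(A, X, Y):
    return not (X & Y)                 # X and Y are disjoint

def most(A, X, Y):
    return len(X & Y) > len(X - Y)     # more X's are Y's than are not

def all_but(n):
    return lambda A, X, Y: len(X - Y) == n

# Hypothetical universe and extensions.
A = frozenset({"ann", "bob", "cat", "dan", "eve", "finn", "rex", "fido", "spot"})
student = frozenset({"ann", "bob"})
teacher = frozenset({"cat", "dan", "eve", "finn"})
hungry_dog = frozenset({"rex", "fido", "spot"})
likes_henry = frozenset({"cat"})
smoke = frozenset({"cat"})
friendly = frozenset({"rex", "fido", "ann"})

s_19_38 = no(A, student, likes_henry)      # No student likes Henry.
s_19_39 = all_but(3)(A, teacher, smoke)    # All but three teachers smoke.
s_19_40 = most(A, hungry_dog, friendly)    # Most hungry dogs are friendly.
print(s_19_38, s_19_39, s_19_40)
```

In each case the noun supplies the first set argument and the verb phrase the second, as in the structure (19.37).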
Other types as well turn up in connection with natural languages. The expressions
‘everything’ and ‘something’ naturally correspond to the type (1) ∀ and ∃. More
generally, NPs may be interpreted as type (1) quantifiers, so that, for example, ‘most
students’ denotes the set of subsets X of the universe whose intersection with the
set of students contains more than half of the students, and ‘all but three dogs’
denotes the set of those X such that the complement of X with respect to the set of
dogs has exactly three elements, etc. Recall also that proper names like ‘John’ can be
taken as type (1) quantifiers (note that this example as well as the last two do not
satisfy ISOM).

One may see the type (1, 1, 1) more than ((19.4) from section 19.4) at work in
the sentence

More students than teachers smoke. (19.42)


But also polyadic lifts appear in the context of natural language quantification; a
survey of this can be found in Keenan and Westerståhl (1997). I will not go further
into these matters here, but end with a few more words about the central type (1, 1)
case, i.e., the determiner denotations.

Given the vast number of mathematically possible type (1, 1) quantifiers, a reasonable
question is whether there are constraints as to which of these can be realized in
natural languages. A prime observation is that the noun argument X in (19.37)
plays a special role: it restricts the domain of quantification. This is borne out by
looking at actual examples:


Exactly three dogs barked. (19.43)


can be seen as quantifying over the subuniverse of dogs; the non-dogs of the
universe are irrelevant for the truth or falsity of this sentence. Also, a special role of
the noun argument is consistent with the syntactic structure of (19.37). An early
observation (in Barwise and Cooper (1981) and Keenan and Stavi (1986)) was that
determiner denotations are conservative: they satisfy


(CONS) Q_A XY ⟺ Q_A X (X ∩ Y), for all A and all X, Y ⊆ A


This means, in effect, that the part Y − X in figure 19.1 (page 443) plays no role in
the truth conditions of Q_A XY. This seems to hold for determiner denotations, but
it does not hold, for example, for the otherwise mathematically perfectly natural
quantifiers I and more (section 19.4). And indeed, there do not seem to be any
determiner expressions in natural languages which denote these quantifiers.
There is one more aspect of domain restriction, however: the part A − (X ∪ Y)
should not matter to the truth conditions either. This can be expressed as the
following condition of extension, first proposed by van Benthem (1986):


(EXT) If X, Y ⊆ A ⊆ A′, then Q_A XY ⟺ Q_A′ XY


That is, what a determiner denotes on a given universe does not ‘change’ if one goes
to a larger universe. So, for example, there could not be a determiner blit, say,
which meant some on universes with less than ten elements, but most on larger
universes (note that this quantifier would still be conservative).

Now recall that the idea of domain or universe restriction was already defined in 
section 19.8 in terms of the notion of relativization. And indeed, conservativity and
extension together capture exactly the same idea:


Fact 19.6 A type (1, 1) quantifier satisfies CONS and EXT iff it is the 
relativization of some type (1) quantifier. 


To see this, check first that Q^rel always satisfies CONS and EXT. In the other
direction, any CONS and EXT Q′ has a type (1) ‘counterpart’ Q defined by

    Q_A X ⟺ Q′_A AX

Then

    Q^rel_A XY ⟺ Q_X (X ∩ Y)       (by (Rel) in section 19.8)
               ⟺ Q′_X X (X ∩ Y)    (by definition)
               ⟺ Q′_A X (X ∩ Y)    (by EXT)
               ⟺ Q′_A XY           (by CONS)

i.e., Q′ = Q^rel.
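
CONS, EXT, and relativization can be tested by brute force on small universes. The sketch below makes several assumptions: quantifiers are Python functions, the universes are tiny, and the hypothetical determiner blit switches from some to most at 3 elements rather than ten, purely to keep the search small.

```python
from itertools import combinations

def subsets(A):
    items = sorted(A)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_conservative(Q, A):
    """CONS on A: Q_A XY iff Q_A X (X & Y), for all X, Y within A."""
    return all(Q(A, X, Y) == Q(A, X, X & Y)
               for X in subsets(A) for Y in subsets(A))

def satisfies_ext(Q, A, A_big):
    """EXT for one pair A, A_big with A a subset of A_big."""
    return all(Q(A, X, Y) == Q(A_big, X, Y)
               for X in subsets(A) for Y in subsets(A))

def relativize(Q1):
    """(Rel): Q^rel_A XY iff Q_X (X & Y), for a type (1) quantifier Q."""
    return lambda A, X, Y: Q1(X, X & Y)

def most(A, X, Y):
    return len(X & Y) > len(X - Y)

def more(A, X, Y):                     # |X| > |Y|; not conservative
    return len(X) > len(Y)

def rescher(A, X):                     # type (1): |X| > |A - X|
    return len(X) > len(A - X)

def blit(A, X, Y, threshold=3):        # 'some' on small universes, 'most' on larger
    return bool(X & Y) if len(A) < threshold else most(A, X, Y)

SMALL = frozenset({1, 2})
BIG = frozenset({1, 2, 3, 4})

print(is_conservative(most, BIG), satisfies_ext(most, SMALL, BIG))
print(is_conservative(more, SMALL))
print(is_conservative(blit, BIG), satisfies_ext(blit, SMALL, BIG))
```

On these universes most passes both tests, more fails CONS, and blit is conservative but fails EXT, as the text predicts. The relativization of the Rescher quantifier coincides with most, which is one instance of Fact 19.6.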

So quantifiers that are denoted by determiners in natural languages satisfy CONS
and EXT. They also satisfy ISOM. The ISOM + CONS + EXT quantifiers form a
natural class, but there have been several attempts to formulate further constraints or
‘linguistic universals’ that single out (important subclasses of) the ‘natural language
quantifiers.’ Prime examples here are the various monotonicity properties discussed
in section 19.10. It may seem, and it has been suggested, that all (monadic)
quantifiers occurring in natural languages are Boolean combinations of monotone
ones. However, an apparent exception to this would be an even number of (Q_even).
And it is true that one can show (this follows from a result in Väänänen (1997))
that Q_even is not definable from MON↑ type (1) quantifiers. Perhaps surprisingly,
however, it is definable from the relativization of such a quantifier, in fact from a
CONS, EXT, ISOM, and MON↑ type (1, 1) quantifier. To see how, note that


    an even number of_A XY ⟺ |X ∩ Y| is even


Now define Q by

    Q_A XY ⟺ |X ∩ Y| ≥ 1   if |X| is even
              |X ∩ Y| ≥ 2   if |X| is odd


Q is clearly CONS, EXT, ISOM, and upward monotone in the second argument.
Notice then that if a ∈ X,

    Q_A X{a} ⟺ |X| is even

But then:

    |X ∩ Y| is even ⟺ X ∩ Y = ∅ or ∃a ∈ X ∩ Y  Q_A (X ∩ Y){a}

That is,

    (an even number of x)(P1(x), P2(x))

is equivalent to

    ¬∃x(P1(x) ∧ P2(x)) ∨ ∃x(P1(x) ∧ P2(x) ∧ (Q y)((P1(y) ∧ P2(y)), (y = x)))


Perhaps one could still argue that Q is ‘unnatural’ in some sense, but I will leave the
matter here.
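
The definability claim can be verified exhaustively on a small universe. This is a sketch under assumptions: Q is coded with the thresholds 1 (for |X| even) and 2 (for |X| odd), and the check covers only a four-element universe, so it illustrates the argument rather than proving it.

```python
from itertools import combinations

def subsets(A):
    items = sorted(A)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def Q(A, X, Y):
    """The auxiliary quantifier: threshold 1 if |X| is even, 2 if |X| is odd."""
    needed = 1 if len(X) % 2 == 0 else 2
    return len(X & Y) >= needed

def even_number_of(A, X, Y):
    return len(X & Y) % 2 == 0

def even_via_Q(A, X, Y):
    """X & Y is empty, or some a in X & Y satisfies Q_A (X & Y) {a}."""
    Z = X & Y
    return not Z or any(Q(A, Z, frozenset({a})) for a in Z)

A = frozenset(range(4))
SETS = subsets(A)

agree = all(even_number_of(A, X, Y) == even_via_Q(A, X, Y)
            for X in SETS for Y in SETS)
upward_monotone = all(not Q(A, X, Y) or Q(A, X, Y2)
                      for X in SETS for Y in SETS for Y2 in SETS if Y <= Y2)
print(agree, upward_monotone)
```

Since Q looks only at |X| and |X ∩ Y|, it is also CONS, EXT and ISOM by construction.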

In the early days of linguistic semantics, it was sometimes suggested that first-order
logic, FO, suffices for the formalization of natural languages. This thesis can be
refuted in many ways, I think, but perhaps the most convincing rebuttal comes from
the theory of quantifiers. Certainly most is a natural language quantifier, but as has
been seen, even if one restricts attention to finite universes, it is not FO definable,
indeed it is not definable from any type (1) quantifiers (theorem 19.4). Essentially
stronger logics than FO are needed to capture the intricacies of quantification in
natural languages.


Suggested further reading


A detailed exposition of most of the aspects of quantification touched on in this chapter can
be found in Westerståhl (1989). A more recent survey article, emphasizing the connection
with natural languages, and in particular the occurrence of polyadic lifts, is Keenan and
Westerståhl (1997). There are several technical papers on the expressive power of various
quantifiers; I would suggest Kolaitis and Väänänen (1998), Väänänen (1997), and Hella et al.
(1997), where the details are spelled out in an accessible way. The canonical collection of
mathematical papers on logics with generalized quantifiers, and more generally on logics
defined in a model-theoretic way, is Barwise and Feferman (1985). A more philosophical
approach to the logic of quantifiers can be found in several of the papers in van Benthem’s
(1986) collection. All of the work cited so far approaches quantification from a logical point
of view. For those interested in the various forms that quantification can take in the world’s
languages, Bach et al. (1995) is an invaluable source. The connection between this more
empirical work and the logic of quantification still remains to be fully explored.
1 Proof: If

    |X − Y| > |Y − X| and |X − Y| > |X ∩ Y|

then

    |X| = |X − Y| > max(|X ∩ Y|, |Y − X|) = |Y|

On the other hand, if

    |X − Y| ≤ |Y − X|

then

    |X| = |X − Y| + |X ∩ Y| ≤ |Y − X| + |X ∩ Y| = |Y|

And if

    |X − Y| ≤ |X ∩ Y|

then

    |X| = |X − Y| + |X ∩ Y|
        ≤ |X ∩ Y| + |X ∩ Y|
        = |X ∩ Y|   (since X ∩ Y is infinite)
        ≤ |Y|  □


2 All of these phrases may in turn have internal structure; in particular, noun phrases can
occur in many different positions in a sentence. Also, quantification can be effected by
other means than determiners, for example using adverbs; I am just looking at the
simplest case here.
Chapter 20


Logic and Natural Language 
Alice ter Meulen 


Logicians have always found inspiration for new research in the ordinary language 
that is used on a daily basis and acquired naturally in childhood. Whereas the logical 
issues in the foundations of mathematics motivated the development of mathemat- 
ical logic with its emphasis on notions of proof, validity, axiomatization, decidability, 
consistency, and completeness, the logical analysis of natural language motivated the 
development of philosophical logic with its emphasis on semantic notions of presupposition,
entailment, modality, conditionals, and intensionality. The relation between
research programs in both mathematical and philosophical logic and natural language 
syntax and semantics as branches of theoretical linguistics has increased in importance 
throughout the last fifty years. This chapter reviews the development of one particularly
interesting and lively area of interaction between formal logic and linguistics:
the semantics of natural language. Research in this emergent field has proved fruitful
for the development of empirically, cognitively adequate models of reasoning with 
partial information, sharing or exchanging information, dynamic interpretation in 
context, belief revision and other cognitive processes. 


20.1. Compositional Semantics 


This section examines the principle of compositionality that Frege introduced, and
how it leads to what is known as ‘Montague grammar.’


20.1.1. Frege’s puzzles


Gottlob Frege, the founder of modern logic, provided a core foundational principle 
for contemporary semantic theory by requiring that the meaning of an expression 
should be a function of the meaning of its parts and the way in which they are
put together. This Principle of Compositionality serves as a major methodological
constraint on the interface between the syntax, which generates a (fragment of a)
natural language, and its semantics, which takes the form of either a recursive
procedure to translate its expressions to formulas of a formal language or a set of 
rules that specify how its expressions are given meaning in a model. For a system 
of interpretation to be compositional, the syntax of the object language needs to be 
formulated in such a way that the semantic properties of each syntactic category are 
completely determined by it. 

Given this methodological requirement of compositionality on a theory of mean- 
ing and interpretation, the two semantic puzzles which preoccupied Frege (1952) 
still constitute major foundational problems that direct much of contemporary linguistic
and philosophical theories of meaning and interpretation. The first puzzle
concerns the information expressed in identity statements with coreferential noun
phrases (NP). The Fregean discussion is based on the question why 


Hesperus is Phosphorus. (20.1)


once was an informative identity statement to the Babylonian astronomers, who
learned from empirical observations of the stars that (20.1) was true, whereas 


Hesperus is Hesperus. (20.2) 


was never informative, even though (20.1) and (20.2) are both true statements and
the NPs, all proper names, corefer to the one and the same object, the planet Venus. 
If coreferential expressions have the same semantic value, they must be substitutable 
for each other in any context without affecting its semantic value. But how can
(20.1) then be informative, while (20.2), in which a coreferential expression is
substituted, is completely uninformative? If a semantic theory is to account for such
facts, it must allow for coreferential expressions to differ in semantic value. For this 
purpose, Frege (1952) introduced the fundamental distinction between the reference 
(Bedeutung) of an expression and its sense (Sinn). Different proper names and other
referential NPs may refer to the same object, but they could still differ in their sense, 
as sense determines reference and not vice versa. Identity statements are informative 
when they contain expressions with different senses, and they are true when their 
NPs are coreferential. Conditions of ‘informativeness’ hence cannot be identified 
with or reduced to truth-conditions. Perhaps there is more to the semantic value of 
an expression beyond its sense and reference, like its psychological associations, 
connotation or ‘color,’ but that part of its meaning is, according to Frege, considered
subjective and should be disregarded in semantics, for it cannot be the source
of public, communicable information. 

The sense of an expression determines its reference in different situations, but
even when the reference of an expression in every situation is determined, this does 
not fix its sense uniquely. If one assumes that the reference of a sentence is its truth 
value, two sentences that necessarily have the same truth value in all situations, e.g. 


Robin won the race. (20.3) 
Everyone who did not compete or lost in the race has done something Robin did 
not do. (20.4) 
still differ in their Fregean sense. Similarly two distinct logical tautologies which are
both necessarily true may have different senses, and convey different information 
depending on the situation in which they are used. The way in which the sense is 
expressed also determines its meaning, which Frege called its way of being given 
(Art des Gegebenseins), or mode of presentation, a notion still requiring clarification.
If the semantics of natural language is to account for coreference, inference, efficient
information sharing and reasoning, it should give a satisfactory analysis of
the Fregean notions of sense, mode of presentation, and reference in compositional
semantics.

The second problem Frege presented as a central question to a theory of meaning
is related to the first one about informative identity statements. If such statements
or any other two statements with the same truth value are embedded as sentential
complements of verbs, the resulting statements may differ in truth value. For 
instance, 


Robin believes that Hesperus is Phosphorus. (20.5)
Robin believes that Hesperus is Hesperus. (20.6) 


(20.5) may be false, whereas (20.6) must be true, even if Robin knows nothing of 
Babylonian astronomy, or if he is not even aware of what the name ‘Hesperus’ refers
to. For Frege, this meant that sentences embedded in that-clauses do not refer, as
they ordinarily do, to their truth value, but refer indirectly, i.e., they refer to their
customary senses and constitute opaque contexts. Substitution of coreferential or
extensionally equivalent expressions in such opaque clauses does not necessarily
preserve the truth value of the entire sentence. 

Ordinary predicate logic cannot account adequately for these puzzles, as two of its 
basic laws fail in such opaque contexts: 





(i) Substitution of logical equivalents may not preserve truth value
(ii) Existential generalization may not be valid


The first problem was explained above, and the second one means that one cannot
infer from a referential expression used in an opaque context that it actually has a
referent. If John believes that the spy is watching him, even if someone is indeed
watching him, he need not be a spy. It may even be that no one is watching him,
but John erroneously believes someone, whom he believes to be a spy, is watching
him. If two sentences differ in their sense, they express different thoughts, as Frege
would say. But to provide a full-fledged compositional semantic analysis of these
sense differences in terms of their information content still constitutes a major
driving force of current research. It requires a satisfactory account of equivalence of
‘information content,’ sufficiently fine-grained to explain when a statement expresses
new information to someone in a particular context, which depends in part on the
information that is already available to him. 

As a first attempt at formalization of Fregean senses, Carnap defined the intension
of an expression as a function from a set of indices to the extension of the expression
(Carnap, 1947). The indices could be given various kinds of interpretations; while
Carnap thought of them as indexing a set of states of affairs, more recent conceptions
have been based on Kripke’s semantics of modal logic, i.e., as possible worlds; see
Kripke (1980 [1972]) or van Benthem (1988). [See also chapter 7.] As a solution to
the Fregean informative identity puzzle, Carnap introduced the notion of an individual
concept for the intension of proper names and referring expressions. Coreferring
NPs would differ in their meaning, if they are interpreted as individual concepts by
different functions from indices to individuals. Different occurrences of the same
referring expression would, however, have to be interpreted as having the same,
constant reference. The problem is now transferred to the problem of telling functions
apart, but this has a clear set-theoretic criterion: functions are identical just in
case they assign the same values to each argument. One consequence of this
set-theoretic criterion of function identity, however, is that all mathematical and logical
truths together with all analytical ones are interpreted on a par by the same constant
function, and, as a result, they have the same intension and are still substitutable in
all contexts, even opaque ones. This leads to the problem of logical omniscience for
accounts of belief reports that rely on possible worlds: Assuming you are rational, if
you believe any contingent truth, you must believe any of its logical consequences,
including all tautologies or necessary truths [see chapter 9] (Stalnaker, 1984).
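
A small model may make the Carnapian picture, and its omniscience problem, more concrete. The sketch below rests on illustrative assumptions only: three indices, individual concepts coded as dictionaries from indices to individuals, sentence intensions coded as sets of indices, and belief coded as truth throughout a set of belief-compatible indices.

```python
WORLDS = ("w0", "w1", "w2")   # w0 plays the role of the actual index

# Individual concepts: functions from indices to individuals (hypothetical values).
hesperus = {"w0": "venus", "w1": "venus", "w2": "mars"}
phosphorus = {"w0": "venus", "w1": "mars", "w2": "venus"}

def proposition(truth_at):
    """Intension of a sentence, given as the set of indices where it is true."""
    return frozenset(w for w in WORLDS if truth_at(w))

hesperus_is_phosphorus = proposition(lambda w: hesperus[w] == phosphorus[w])
hesperus_is_hesperus = proposition(lambda w: hesperus[w] == hesperus[w])

# A second necessary truth, built from an unrelated contingent sentence.
rain = {"w0": True, "w1": False, "w2": True}
rain_or_not_rain = proposition(lambda w: rain[w] or not rain[w])

def believes(belief_worlds, prop):
    """Belief as truth at every index compatible with what is believed."""
    return belief_worlds <= prop

robin = frozenset({"w1"})     # Robin's belief-compatible indices

print(hesperus["w0"] == phosphorus["w0"], hesperus == phosphorus)
print(believes(robin, hesperus_is_phosphorus), believes(robin, hesperus_is_hesperus))
print(hesperus_is_hesperus == rain_or_not_rain)
```

The two names corefer at w0 but have different intensions, so (20.5) and (20.6) can come apart. The two necessary truths, however, receive one and the same intension, so any believer in this model believes both: this is the logical omniscience problem in miniature.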





20.1.2. Montague grammar 


Montague grammar, developed in the late 1960s by the logician Richard Montague, 
extended Carnapian ideas dramatically, and provided a major step towards a com- 
positional theory of interpretation of ordinary language, since it specified in all 
required detail how the semantic value of a sentence could be computed from its 
syntactic derivation (Dowty et al., 1981; Partee, 1997). In the approach described
here, called PTQ after the title of Montague (1974c), compositionality takes the
form sometimes referred to as rule-by-rule compositionality: each syntactic rule is
mirrored by a semantic one that specifies how the meaning of the input to the
syntactic operation determines the meaning of its output. A central issue for the
compositional theory of interpretation was to provide a uniform function/argument
structure to both sentences with quantified NPs and sentences with simple referential
ones. Montague’s insight was to interpret every NP, regardless of syntactic form,
as denoting a set of properties of individuals, doing justice to compositionality,
and reducing them if possible to the familiar Fregean interpretation using standard 
first-order representations of existential and universal quantifiers. Predicates deriving 
from the interpretation of verb-phrases (VPs) are then simply interpreted as proper- 
ties of individuals which either are or are not in the set of properties interpreting the 
NP in subject position. Forms of quantification in natural language which are known 
not to be expressible or definable in terms of the first-order logical quantifiers can 
similarly be interpreted by this higher-order notion of generalized quantifier. For 
example, most students is not first-order definable, since its interpretation requires a
one-to-one mapping between two sets dependent on a well-ordering by cardinality
[see chapter 19] (Barwise and Cooper, 1981).
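
The uniform treatment of NPs can be sketched extensionally. In the fragment below, which is a deliberate simplification and not PTQ itself, properties are coded as sets of individuals, an NP denotes a test on properties (the characteristic function of a set of properties), and a sentence is true when the VP property passes that test. All names and the toy model are assumptions made for illustration; intensionality is ignored.

```python
# Hypothetical model: individuals and two properties.
student = frozenset({"mary", "sue", "tom"})
walk = frozenset({"mary", "sue"})

def name(a):
    """Proper name: the set of properties that individual a has."""
    return lambda P: a in P

def every(N):
    return lambda P: N <= P

def some(N):
    return lambda P: bool(N & P)

def most(N):
    """Not first-order definable, but unproblematic as a set of properties."""
    return lambda P: len(N & P) > len(N - P)

def sentence(np, vp):
    """Subject-predicate rule: is the VP property in the NP denotation?"""
    return np(vp)

print(sentence(name("mary"), walk))      # Mary walks.
print(sentence(every(student), walk))    # Every student walks.
print(sentence(some(student), walk))     # Some student walks.
print(sentence(most(student), walk))     # Most students walk.
```

The same rule `sentence` serves proper names and quantified NPs alike, which is the uniform function/argument structure the text describes.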
In PTQ, Montague defined the interpretation procedure for any expression of the
natural language fragment in two steps: first, the translation of the syntactically
disambiguated natural language expression to an expression of the formal language
of the intensional logic, in particular a typed higher-order lambda calculus, and,
second, the interpretation of that formal expression in the model-theoretic semantics.
Hence, in this framework, the formal language intervenes between the natural
language and the specification of meaning in terms of truth conditions in a possible
worlds model. Montague himself, however, emphasized that there is no real need for
such an intermediate level of ‘logical form’ representing the meaning of an expression
of natural language, and in another article (Montague, 1974a), he defined
the semantic interpretation directly as a mapping from the syntactic structure of
the English expressions to appropriate model-theoretic objects and functions. The
reason why the PTQ approach with indirect interpretation via a formal language has
prevailed over this direct approach is largely practical. It is easier to see, for instance,
the quantifier scope dependencies as differences in the linear order of quantifiers in
a formula than to follow the notation of complex model-theoretic functions assigning
values to variables and all of the alternatives of such assignments. It should,
however, not be forgotten that formulas merely encode their semantic interpretation
in model-theoretic terms. One can, in principle, characterize any number of languages
which can be used to intervene between the natural language and the model-theory,
as if there were really a daisy chain of such interlocking levels of representation.

It is well-known that the intensional logic originally used in PTQ as the language
of ‘logical form’ could be replaced without loss of expressive power by a language in
which quantifiers could range over reference-points (worlds at a certain time) (Gallin,
1975). This seems to have initiated a research trend of fleshing out the formal language
with different parameters which initially belonged to the model-theoretic realm.
In the newest dynamic theories of interpretation elements of the non-linguistic
context, as well as variable-assignment functions themselves, may be quantified over;
see section 20.3.

A general theoretical question remains at this point whether a many-sorted
first-order logic would ultimately suffice for the semantics of natural language or
whether it still needs to be enriched with tools or techniques of a higher-order
logic, as in PTQ. The desiderata of axiomatizability, decidability and completeness
for the logic of natural language still play an important background role in such new
developments. 


20.1.3. The nature of meaning-postulates 


In PTQ, meaning-postulates are formulas of the intensional logic which are true, not 
only in all possible worlds, but in all possible models. In other words, by defining
a set of meaning-postulates one characterizes which among all logically possible
models provide a plausible interpretation for the natural language interpreted. They
were, for instance, required to capture the necessary truth of analytic statements, i.e.,
statements which are always true due to the meaning of their descriptive vocabulary,
such as

Bachelors are unmarried.


But meaning-postulates were designed to do a quite diverse number of jobs, e.g.,
the reduction of intensional formulas to extensional ones in contexts where existential
generalization and truth-preserving substitutivity of coreferential expressions held.
They also guaranteed that proper names are Kripkean rigid designators, i.e., referred
to the same individual even in opaque contexts. As meaning-postulates were only
required to be well-formed formulas of the intensional logic, there was no constraint
on the kinds of tasks they could be designed for. The Montagovian strategy was to
hardwire all semantic structure one might ever wish to use into the model-theoretic
rules and then weed it out by meaning-postulates whenever appropriate. For instance,
even a simple extensional sentence like


Mary walks. (20.7)


required higher-order and intensional types of logical expressions, computing its
truth value at each possible world by determining of all properties which ones
applied to a function from all worlds to Mary at each world. By making all functions
total (i.e., defined for every argument of the appropriate type), PTQ requires one to
determine of anyone who could walk, whether he did walk at each possible world,
before one could determine whether (20.7) was true. The inefficiency and computational
intractability of admitting only total functions made this strategy rather unattractive
for any computational implementation of the inferences that PTQ could
otherwise account for so beautifully.





20.2. Towards a New Theory of Meaning and Interpretation 


This section looks at some problems that arise in Montague grammar, most notably
problems with anaphora, and some other questions that led to different theories of
meaning and interpretation.


20.2.1. Anaphora in Montague grammar 


Anaphora is the common phenomenon in which the interpretation of an expression, 
such as a pronoun, is determined by its relation to another expression, its ante- 
cedent. It has commonly been thought that such pronouns function either like 
individual constants coreferring with their antecedents or else as variables bound 
by their antecedents. Neither approach, however, seems to apply to all cases, and 
anaphora has often proved a stumbling block to formal semantic theories. In PTQ, 
since all NPs are interpreted as generalized quantifiers, pronouns are bound by using 
rules which syntactically replace the first occurrence of an indexed pronoun in a 
sentence, whether in a common-noun phrase with a relative clause or in a VP, by the 
antecedent NP which then semantically binds any subsequent pronoun bearing the 
same index in that sentence. These ‘quantifying in’ rules apply to any kind of NP
(proper name, existential or universal) and any complex sentence, nested relative
clause or complex VP. If the NP introduced by such a rule is quantificational, its
translation has scope over any scope-bearing expression already present in the sentence,
common-noun phrase (CN) or VP quantified into. This technique provides a universal
compositional method of accounting for scope-ambiguity in natural language by
syntactic disambiguation. The traditional ambiguities of universal and existential NPs
are thus treated on a par with the ambiguities in intensional contexts, the de re and
de dicto readings of NPs [see chapter 7]. The example discussed in section 20.1.1


John believes that a spy is watching him. 


would be generated by quantifying in both NPs ‘John’ and ‘a spy’ in order to say
there actually was a spy watching John. But if the sentence allows that the spy was
a mere figment of John’s imagination, the interpretation would proceed directly,
composing ‘a spy’ with ‘watch him’, and not allow for existential generalization,
which would imply the existence of an actual spy.

From a linguistic point of view, these rules of quantifying in overgenerate spurious 
ambiguities and fail to account for some essential differences in anaphoric potential 
of the three different kinds of NPs. Since the rules apply to any NP, for each NP in
a sentence there are always at least two syntactic derivations of that sentence, one
direct and one indirect derivation using a quantifying in rule. Even for the extremely
simple extensional (20.7), PTQ would generate at least two non-trivially different
interpretations, one directly composing the subject with the VP and the other
quantifying in the subject into a proposition with a free variable. Semantically these
distinct derivations would be logically equivalent, so these syntactically driven differences
have no semantic effect. If this is the price one has to pay for the compositionality
of the semantic interpretation, it would be relatively harmless. But despite its universality,
there are very natural interpretations of NPs in intensional contexts which
cannot be accounted for by quantifying in rules. Examples of such sentences started
emerging in the philosophy of language as early as 1962 (Geach, 1962) and are
based on the fact that it is not possible to evaluate an NP at a possible, non-actual
world and retain its value while accessing another world. For instance, in (20.8)





John tries to catch a fish and wants to eat it. (20.8)


one would like to interpret ‘a fish’ de dicto, with the fish John is trying to catch, and
then use that fish as referent of the subsequent pronoun. For such a coreferential de
dicto reading the quantifying in rule would have to be applied after the two intensional
VPs are conjoined, which would produce only a de re reading, which is counter-intuitive.
Further examples which demonstrate essential limitations on the technique
of quantifying in involve what are called E-type pronouns.

If Mary dates a guy her parents disapprove of, they will make his visit miserable. (20.9)
Every woman who loves a man kisses him. (20.10)







In (20.9), the pronoun ‘his’ refers to any guy Mary dates and her parents disapprove
of. In (20.10) ‘him’ refers to any man loved by a woman. The readings PTQ will
generate with such bound pronouns necessarily give widest scope to the existential
antecedent NPs, contrary to our intuitions which tell us that neither (20.9) nor
(20.10) must be interpreted as being about a specific existing individual.
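
The gap between the two readings of (20.10) shows up already in a four-person model. The sketch below is an assumption-laden illustration: the wide-scope existential formula stands in for the reading obtained by quantifying in ‘a man’, the universal formula stands in for the intuitive reading, and the individuals and relations are invented.

```python
# Hypothetical model for 'Every woman who loves a man kisses him.'
women = {"ann", "beth"}
men = {"carl", "dan"}
loves = {("ann", "carl"), ("beth", "dan")}
kisses = {("ann", "carl")}

def wide_scope_existential():
    """There is a man y such that every woman who loves y kisses y."""
    return any(all((x, y) not in loves or (x, y) in kisses for x in women)
               for y in men)

def universal_reading():
    """For every woman x and man y: if x loves y, then x kisses y."""
    return all((x, y) in kisses
               for x in women for y in men if (x, y) in loves)

print(wide_scope_existential())
print(universal_reading())
```

Carl verifies the wide-scope reading, but Beth loves Dan without kissing him, so the universal reading fails. A speaker who judges (20.10) false in this situation is therefore not using the wide-scope reading.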

Another objection already mentioned to the uniform treatment of NPs by the
quantifying in rules is based on the fact that universal NPs in relative clauses cannot
bind pronouns in the VP, but existential ones and proper names do; see (20.11)-(20.14).


* A woman who kissed every man left him. (20.11)
A woman who kissed a man left him. (20.12)
A woman who kissed Jim left him. (20.13)
No woman who kissed a man left him. (20.14)


Quantifying in universal NPs hence needs to be restricted in a principled way
to prevent such bindings as in (20.11) from arising, but the PTQ rules are entirely
unrestricted. In the generative linguistic literature a host of facts concerning the
difference in anaphoric potential of the three kinds of NPs has been reported,
which any proper semantic theory of anaphora should take into account. Just to
mention a few of the most interesting facts, consider the anaphoric dependencies
in (20.15)-(20.17).


His mother loves John. (20.15)
His mother loves a/every man. (20.16)
A woman loves her. (20.17)


Proper names allow for backwards anaphora, whereas existential or universal NPs 
generally do not, as is seen in (20.15) and (20.16), although the pronoun can still 
be bound by another antecedent or be interpreted deictically. In (20.17), PTQ 
would allow for a coreferential interpretation of the subject and the pronoun, 
if the subject were quantified in, contrary to our intuitions. In (20.18), inverse 
scope 


A flag was hanging from every window. (20.18) 


of the NPs shows that the syntactic linear order of the NPs may be inverse to their 
preferred semantic interpretation. Such clear linguistic facts concerning the different 
anaphoric potential between NPs should be explained in a satisfactory and universal 
account of anaphora, which departs more radically from some of the fundamental 
assumptions of variable binding in formal languages which PTQ inherited from its 
logical tradition. What must be revised is the intrinsic connection between scope and
variable binding. 







Logic and Natural Language 


The problem of spurious ambiguities was tackled first by Cooper (1983) by
weakening compositionality in a precise and constrained way. His grammar was
allowed to generate ambiguous sentences, and it did not include any quantifying in
rules. Hence meaning was not completely determined by syntactic form and the
grammar did not embody the rule-by-rule compositionality of PTQ. Instead, the
semantic interpretation must choose for any NP it evaluates whether to determine
its semantic value immediately or to put it ‘in storage,’ placing it on hold in a stack
of NPs whose interpretation is deferred to a later point. When a stored NP is
retrieved for evaluation, it receives scope over everything that is already interpreted
at that stage of the semantic interpretation. This NP-storage technique circumvented
one linguistic objection to quantifying in rules, but it still required overgenerating
in the syntax, where gaps or empty NPs are generated to be bound by
wh-quantifiers (formed, e.g., by words like ‘who’ and ‘which’), but may be filtered
out in the semantics if the quantificational structure is deviant. No appeal is made in
Cooper's framework to any syntactic notion of ill-formedness in such semantically
uninterpretable strings. The problems with cross-world quantification and the e-type
pronouns remain, however, since this framework has no means to keep track of
information already obtained about the referent of a pronoun, lacking any notion of
context and dynamic binding.
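
The storage idea can be made concrete with a small sketch. The Python below is an illustration only, not Cooper's formalism: the toy model, the names `every`, `some` and `readings`, and the use of permutations to enumerate retrieval orders are all assumptions made for the example. It shows how one analysis of ‘Every woman kisses a man’ with both NPs in storage yields one reading per retrieval order, without any quantifying in rule in the syntax.

```python
from itertools import permutations

# Hypothetical toy model, invented for this illustration.
WOMEN = {"ann", "bea"}
MEN = {"carl", "dan"}
KISS = {("ann", "carl"), ("bea", "dan")}


def every(restrictor):
    """Generalized quantifier: the scope must hold of every member."""
    return lambda scope: all(scope(d) for d in restrictor)


def some(restrictor):
    """Generalized quantifier: the scope must hold of at least one member."""
    return lambda scope: any(scope(d) for d in restrictor)


def readings(core, store):
    """Evaluate `core` once per retrieval order of the stored NPs.

    `store` maps a variable name to a stored quantifier; `core` maps an
    assignment (a dict) to a truth value. Each key of the result lists the
    variables from widest to narrowest scope.
    """
    def evaluate(order, env):
        if not order:
            return core(env)
        var, rest = order[0], order[1:]
        return store[var](lambda d: evaluate(rest, {**env, var: d}))

    return {order: evaluate(order, {}) for order in permutations(store)}


store = {"x": every(WOMEN), "y": some(MEN)}
result = readings(lambda env: (env["x"], env["y"]) in KISS, store)
```

In this model the reading with ‘every woman’ taking wide scope comes out true, while the reading with ‘a man’ taking wide scope comes out false, since no single man is kissed by both women.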

The problem of logical omniscience mentioned in section 20.1.1, which also
applies to PTQ, demonstrated that the logical mechanics of predicate logic would
require substantial revision if they were to simulate how context, prior information
and the external situation of use might be used to draw inferences from given information.
The characteristic topic-neutrality of logic is seen as one of the sources of the
problem of logical omniscience in possible worlds semantics. A beginner in logic
often experiences how difficult it is to rid oneself of the natural topicality and
context-dependent aspects of reasoning. For instance, learning the disjunctive law,
disjoining a proven formula with any arbitrary formula, one must learn to consider
formulas that may be completely irrelevant to the first one. To make more precise
what it means for two sentences to be about the same topic, or to be relevant to
each other, a more sensitive notion of the informative content of a sentence in a
context is required. Also the requirement of PTQ that all semantic functions be total
should be relinquished, for definite referential NPs could fail to pick out a referent
at some worlds or presuppositions of other expressions could fail, and sentences with
uninterpreted constituents should be neither true nor false (pace Russell and with
Strawson).

Yet another problem, of a more metaphysical nature, faced possible world semantics.
What are possible worlds, if not mere formal entities that serve to distinguish
contingent from necessary truths? If an individual at two different worlds may have
no properties in common, in what sense is it still one and the same individual?
Kripke (1980 [1972]), along with other possible world semanticists reviving an
Aristotelian essentialism, argued that some properties were essential to an individual,
most notably the properties concerning its origin. Personal identity would only
break down when such essential properties were lost. Other possible world
semanticists, especially David Lewis (1983), took the extreme opposite view and
argued that individuals can never be the same across possible worlds, but are rather
related by a much weaker counterpart relation and need not have any common
properties. The philosophical debate continues to be lively and provides a plethora
of philosophical options on choices of primitives, views on identity of individuals,
properties and propositions (Almog et al., 1989; Stalnaker, 1984). But the need for
possible worlds in the semantics of natural language is now also disputed by the
theories of dynamic interpretation. Although intensional contexts and opacity phenomena
obviously require tools beyond mere extensional first-order logic, one might
reinterpret possible worlds in terms of possible states of information an interpreter
can be in. Modality should accordingly range over possible updates of the given
information, not over metaphysically possible states of the world. This epistemic
turn of natural language semantics opens the way for dynamic interpretation, where
the core concept is updating a given information state, by using a sentence in
discourse to be interpreted as an instruction to add new information or constrain the
given information in a particular way. By shifting away from the Fregean focus on
truth functional meaning to a theory of informative content of sentences in discourse,
but still characterizing various inferences as truth preserving operations on
given information, the semantics of natural language is now merging the traditional
issues of truth functional meaning with pragmatic issues of context-dependent interpretation,
interpretation as action between people and sharing of information.


20.2.2. Discourse 


The PTQ technique of quantifying in to obtain wide scope readings with bound
pronouns was shown above to have some inherent shortcomings from a natural
language point of view. Considering anaphoric dependencies that arise between
sentences, one sees that no simple generalization of the quantifying in rules can ever
account for such forms of binding in discourse. Binding of pronouns is not postponed
until one reaches the end of a sequence of sentences. It is rather a more
dynamic process, where the interpretation of a sentence in a sequence is constrained
by what information is gained from preceding sentences and whatever common
background is supposed, and in turn constrains the ways in which the information
exchange may be continued. In discourse, universal NPs are again more limited in
their anaphoric potential than either existential ones or proper names, as in
(20.19)-(20.21).


* Every woman kissed a man. She left. (20.19)


In (20.19), the pronoun ‘she’ cannot be referentially dependent upon the universal
NP ‘every woman’ in the preceding sentence. Existential NPs, definite descriptions
or proper names may, however, corefer with pronouns across sentences, as in (20.20)
and (20.21).


A/The woman kissed a man. She left. (20.20) 
Jane kissed a man. She left. (20.21) 


Since the PTQ quantifying in rules do not apply to sequences of sentences, this
characteristic difference in anaphoric potential of universal versus referential NPs cannot
really constitute a principled objection to quantifying in. But if these rules were
generalized to apply to sequences of sentences, quantifying in would not always give
the desired result. For instance, to generate a bound reading of


Only one student is reading. He is sitting at the table. (20.22)


the NP ‘only one student’ should be quantified into the sequence ‘he₁ is reading, he₁
is sitting at the table’. But then the interpretation is weaker than intuitively needed,
requiring only that there be precisely one student who has both properties of reading
and sitting at the table. It does not rule out other reading students, as (20.22) seems
to require. Hence a generalization of the quantifying in rules will not make the
correct predictions for bound pronouns in discourse (Gamut, 1991, ch. 8). But the
similarity between binding pronouns across sentences as in (20.19)-(20.22) and
within sentences as in (20.11)-(20.14) is striking. Any semantic theory of binding
should not only account for the difference in anaphoric potential of the three kinds
of NPs, but also admit generalization to the interpretation of pronouns in discourse.
This insight has been a driving force behind the development of the theories of
dynamic interpretation discussed in section 20.3.
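
The contrast between the two readings of (20.22) can be spelled out in first-order terms. The two formulas below are one possible rendering and are not taken from the source; the predicate letters S (student), R (is reading) and T (is sitting at the table) are chosen here for illustration. The first is what quantifying ‘only one student’ into both sentences would yield, the second is the reading (20.22) seems to require.

```latex
% Quantified-in reading: exactly one student both reads and sits at the table.
\exists x\,\bigl[S(x) \wedge R(x) \wedge T(x) \wedge
  \forall y\,\bigl((S(y) \wedge R(y) \wedge T(y)) \rightarrow y = x\bigr)\bigr]

% Intended reading: exactly one student reads, and that student sits at the table.
\exists x\,\bigl[S(x) \wedge R(x) \wedge
  \forall y\,\bigl((S(y) \wedge R(y)) \rightarrow y = x\bigr) \wedge T(x)\bigr]
```

The second formula entails the first, but not conversely: a model with two reading students, only one of whom sits at the table, verifies the first formula and falsifies the second.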


20.2.3. The fallacy of misplaced information 


The information one may derive from interpreting an expression depends on a host
of different parameters. Consider an utterance of a simple sentence in (20.23).


My husband and I invited her for dinner today. (20.23) 


The direct situation of use determines, for instance, the reference of indexical expressions
like ‘I’ and ‘today’, but common sense knowledge may be necessary to understand
what a dinner invitation is, and linguistic knowledge will help determine who ‘her’
could refer to. Informative content arises as a relation between these parameters, the
syntactic form of the expression used, its meaning, and the external world. Barwise 
and Perry (1983) consider meaning as such a dependency of many parameters. They 
stress that a sentence may be used in different situations to convey different informa- 
tion, which is why communication in natural language is so efficient. The sentence 
used in (20.23) could be used in a different situation to express completely different 
information. Sentences may be used to describe parts of the world, situations, and
the reference of a sentence should be the set of such described situations, and not 
merely a truth value as Frege would have it. The meaning of the sentence partially 
determines which situations it can be used to describe. But other contextual para- 
meters come into play when interpreting the use of the sentence in a particular 
situation as giving information about a particular topic. To assume that the entire 
informative content of the use of an expression is determined solely by its interpreta- 
tion is what Barwise and Perry call the fallacy of misplaced information. 

The performative hypothesis, popularized in pragmatic theories of meaning such
as Searle's (1970), proposed to analyze any sentence as subordinate to a performative
first-person verb, e.g.


It is raining. (20.24) 
I inform you that it is raining. (20.25)


This is a clear example of the fallacy of misplaced information, as it attempts to put
information about the situation of use overtly into the described situation. Similarly,
the Russellian analysis of definite descriptions, which analyzes any definite description
as referring to whatever unique individual satisfies the describing properties, is
prone to the fallacy of misplaced information, since definite descriptions can be used
to give information about the situation of use, which is distinct from the described
situation. Denying, as many direct reference theorists like Kripke have, that proper
names can be used in a context to contribute to its interpretation is another
instance of the fallacy. If I introduce myself, this is a meaningful communicative act,
because the addressee receives the information what to call me, and knows henceforth
how to refer to me. If names had no meaning beyond referring directly to their
bearers, it would not be possible to explain how one extracts useful information
from such an introduction. Informative identities are informative because the two
coreferring expressions each contribute a different property of being so named. If I
use (20.26) to report to you Jane’s belief that her husband is happy


Jane believes Jim is happy. (20.26)


I invite you to infer, by an implicature, revocable upon further information, that
Jane herself would report this belief using the proper name Jim. If I had used,
instead of (20.26),


Jane believes her husband is happy. (20.27)


a different implicature would be invited. But both (20.26) and (20.27) may be true 
even when Jane denies that Jim is her husband. A Fregean theory of reference could 
never account for this, as it avoids context-dependent parameters of language use. In
a dynamic theory of interpretation the shifting reference of indexicals and demonstratives
should be accounted for in constructing a context from which conclusions may
be drawn which could also contain indexical expressions. The central assumption of 
classical logic, that context dependence should be avoided or eliminated, is no 
longer viable, once reasoning of human interpreters in natural language has become 
the target of investigation. 


20.3. Theories of Dynamic Interpretation 


This section presents some alternatives to Montague grammar. These are theories of
a more dynamic interpretation.


20.3.1. Discourse representation theory (DRT)


DRT was developed by Kamp (Kamp, 1984; Van Eijck and Kamp, 1997) partly in 
response to the anaphora problems Montague grammar was facing. But it was also 
motivated by a more philosophical concern with the nature of reference, meaning,
inference and interpretation. Independently, Heim (1982, 1983) developed a closely 
related theory of dynamic interpretation, File Change Semantics. The main ideas 
and concepts of DRT are presented here in the context of the presentation of 
natural language semantics. (See the Suggested Readings for more comprehensive 
expositions of the theory.) 

The core claim of DRT is that interpretation should be considered a dynamic
process, in which discourse representation structures (DRSs) are constructed representing
the information and the anaphoric dependencies expressed in a sequence of
sentences. Such information is true in a model just in case there is a structure-preserving
embedding into that model of the reference markers, one which verifies the descriptive
conditions relating them that constitute the representation. The conditions
in the representation arise incrementally from the interpretation of the sequence of
sentences by application of the construction rules for DRSs. A condition is a property
or relation with an appropriate number of reference-markers as arguments;
these function in certain respects like context-dependent referring variables. DRSs
consist of different levels of conditions, where a reference marker is accessible from a
lower level only if it is declared at a higher level. Negation, modality, conditionals,
and quantifiers create deeper levels of embedded structure in the DRS.

The DRS-construction rules require that a proper name introduces a reference
marker in the top level of the representation, which remains accessible to any lower
level, thus capturing the semantic property of names that they always take widest
scope or refer rigidly to one and the same referent no matter what context they
occur in. An indefinite NP introduces a new reference marker into the given level,
which may be a lower one, and the predicate in its CN is attached as a property of
the reference-marker. Definite NPs are treated differently; they must be identified
with an accessible reference-marker present in the given or any higher level of the
representation. Since pronouns are definite NPs too, their reference-marker is unified
with the accessible reference-marker of their antecedent NP. Clauses with universal
NPs force a split of the DRS into two levels, where the information in the universal
NP is represented as a property of a new reference-marker in the first deeper level,
and the information expressed in the remainder of the sentence by conditions in a
new, deeper, subordinate level. The embedding conditions of such a split of levels
require that every verifying embedding of the conditions in the first deeper level can
be extended to a verifying embedding of the conditions in the next deeper level.
Some illustrations of these DRS-construction rules are given below, along with an
analysis of (20.10), which constituted a problem for PTQ's account of anaphora.





A man came in. He sat down. (20.28)


An indefinite NP is represented by introducing a new reference marker x and attaching
the CN as a property of x, and representing the remainder of the sentence as a
property of x too. The pronoun which is anaphoric to the indefinite NP is represented
by its own reference marker y and identified with x, the marker for its antecedent.





x, y
man (x)
come in (x)
sit down (y)
y = x (20.29)














The DRS in (20.29) is true in a model M = ⟨D, I⟩ (where D is a domain
of individuals and I the usual set-theoretic interpretation function assigning sets of
tuples to n-place predicates) iff (if and only if) there is a verifying embedding
f mapping x and y into the same individual in D, and f(x) ∈ I(man), f(x) ∈ I(come
in), f(y) ∈ I(sit down).
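
The truth definition for (20.29) can be checked mechanically on a finite model. The sketch below is only an illustration under stated assumptions: the function names `verifying_embeddings` and `holds`, the encoding of conditions as tuples, and the two-individual model are invented here and are not part of DRT as presented in the text. It searches by brute force for embeddings of the reference markers into the domain that verify every condition.

```python
from itertools import product


def holds(cond, f, interp):
    """Check one condition under the embedding f."""
    if cond[0] == "=":
        return f[cond[1]] == f[cond[2]]
    return tuple(f[m] for m in cond[1:]) in interp[cond[0]]


def verifying_embeddings(markers, conditions, domain, interp):
    """Return all maps f from reference markers into `domain` that verify
    every condition. A condition is ("=", m1, m2) or (predicate, m1, ..., mn).
    """
    found = []
    for values in product(sorted(domain), repeat=len(markers)):
        f = dict(zip(markers, values))
        if all(holds(cond, f, interp) for cond in conditions):
            found.append(f)
    return found


# The DRS (20.29): universe {x, y} plus four conditions.
MARKERS = ["x", "y"]
CONDITIONS = [("man", "x"), ("come in", "x"), ("sit down", "y"), ("=", "y", "x")]

# Hypothetical model M = <D, I>, invented for the example.
DOMAIN = {"al", "bo"}
INTERP = {"man": {("al",), ("bo",)}, "come in": {("al",)}, "sit down": {("al",)}}

embeddings = verifying_embeddings(MARKERS, CONDITIONS, DOMAIN, INTERP)
drs_is_true = bool(embeddings)
```

The DRS counts as true in the model as soon as one verifying embedding exists; if the man who came in is not the one who sat down, the search returns an empty list and the DRS is false.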

If the antecedent in (20.28) were a singular definite description, as in (20.30),


The man came in. He sat down. (20.30)


the construction of the DRS would essentially be the same, with the sole difference 
that the reference marker for the definite description should already be available 
either in the DRS representing preceding sentences, or as part of the assumed 
common ground of the discourse.

A universal NP gives rise to a split of the DRS into a top level containing reference 
markers for proper names and all referential NPs, if any were so far represented, a 
first deeper level which represents the CN in the universal NP, and a second deeper 
level which represents the VP of the sentence. Sentences to be represented after the 
split are processed at the top level above the split structure. 


Every man came in. * He sat down. (20.31) 


The first sentence in (20.31) is represented as:








x
man (x) | come in (x)














The second sentence should be represented as a condition on x, but due to the
structure of the levels x is inaccessible from the top level, where the second sentence
is to be processed. So the bound variable reading of the pronoun in (20.31) is
excluded, although a deictic reading is still available.
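
Accessibility between levels can be pictured as a walk upward through nested boxes. The following sketch is a simplified illustration, not Kamp's definition: the class name `Box` and its methods are hypothetical, and conditions are left out so that only the declaration of reference markers at each level is modeled.

```python
class Box:
    """One level of a DRS: the markers declared there plus a link to the
    level that contains it (None for the top level)."""

    def __init__(self, markers=(), parent=None):
        self.markers = set(markers)
        self.parent = parent

    def accessible(self):
        """Markers declared at this level or at any higher level."""
        box, seen = self, set()
        while box is not None:
            seen |= box.markers
            box = box.parent
        return seen


# 'Every man came in.' splits the DRS: x is declared in the restrictor level.
top = Box()
restrictor = Box(["x"], parent=top)
scope = Box(parent=restrictor)

# (20.10): both x (the woman) and y (the man) are declared in the restrictor.
donkey_restrictor = Box(["x", "y"], parent=Box())
donkey_scope = Box(parent=donkey_restrictor)
```

A pronoun processed at `top` finds no marker to unify with, which mirrors the exclusion of the bound reading in (20.31), whereas a pronoun processed in `donkey_scope` can reach both markers of the restrictor.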

Indefinite NPs in the restrictor of universal NPs which bind pronouns in the VP 
formed a major problem for a PTQ-style quantifying in account of binding. In
DRT, such anaphora are accounted for by the accessibility conditions between levels
of the DRS. Sentence (20.10) is represented according to the construction rules for
the DRS as in (20.10a).


Every woman who loves a man kisses him. (20.10) 


x, y
woman (x) | kiss (x, y)
love (x, y)
man (y) (20.10a)














With the embedding conditions for a subordinating construction, (20.10) is interpreted
as true in any model where any man loved by any woman is kissed by her, no
matter how many men each woman loves.
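
Stated as a first-order formula, these truth conditions amount to universal quantification over both reference markers of the restrictor. The formula below is a standard paraphrase supplied here for illustration; it is not given in this form in the source.

```latex
\forall x\,\forall y\,\bigl[(\mathrm{woman}(x) \wedge \mathrm{man}(y) \wedge \mathrm{love}(x, y))
  \rightarrow \mathrm{kiss}(x, y)\bigr]
```

The indefinite ‘a man’ thus receives universal force without being given wide scope as an existential quantifier, which is what the quantifying in analysis could not deliver.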

Deictically used referential NPs are directly referential to an individual in the
immediate situation of use. In DRT, such a directly referential link is represented by 
an external anchor, which is an ordered pair consisting of the reference marker for 
that NP and some object in the immediate situation of use. Such external anchors 
are themselves not parts of the DRS but rather constrain the set of verifying 
embeddings of the DRS into the model. The semantic content of a deictically used
referential NP is completely determined by the associated external anchor, but the 
information someone may obtain from the use of such an expression is partly 
dependent on the form of the NP itself.


20.3.2. Situation semantics (SS) 


SS (Barwise, 1988; Barwise and Perry, 1983; Seligman and Moss, 1997) is a theory
of dynamic interpretation which does not rely on a syntactic level of representation
for anaphoric dependencies. Instead, information structure is constructed from semantic
objects, which may or may not be parts of the actual world. Meaning arises
as a relation between linguistic expressions, the context of use (including time of
utterance, speaker, audience, location), linguistic and logical constraints, and the
external world. Despite this important difference from DRT, the two theories are
significantly similar in the insights and logical tools they offer to linguistic analyses.
The primitive objects in SS are n-place relations, individuals, locations and polarities.
They constitute events or situations, e.g. ⟨l, ⟨⟨walk, Mary⟩, 1⟩⟩ represents a situation
of Mary walking at l, and ⟨l, ⟨⟨kiss, Mary, John⟩, 0⟩⟩ a situation at l in which Mary
does not kiss John. Indeterminates or parameters act like reference-markers for
locations, relations, individuals and polarities, and are equally constituents of situations.
They are assigned appropriate values by partial assignment functions, or by
context-dependent speaker connections to parts of the external world.

For example, a definite description can be interpreted as referring to an individual, 
determined by the speaker-connection, which is customarily called a referentially 
used definite description. A definite description which is so used to refer to an 
individual does not require its descriptive properties to be true of the referent. 
This is commonly recognized to be possible when speaker reference is at stake.
In SS, this usage is called the value-laden use of definite descriptions. But a definite
description can also constrain a situation, picking up an individual to contribute
to another situation. This is called the value-free, or attributive use of a definite 
description. 


Thus, one can use the NP in (20.32)


The woman in the red skirt is tall. (20.32)


to refer to Mary wearing a red skirt in situation s. For such a value-laden interpretation
one fixes the resource situation s and the speaker connections c, and represents it as


c⟦the woman in the red skirt⟧(s) = Mary


Third party reports of what was said may, of course, use other NPs to continue the
reference to Mary, e.g. if Mary is also reading in s, coreference is established with





She said that this reader is tall. (20.33)


In the attributive use of this definite description, the interpretation is a relation
between situations and individuals, whoever fits the descriptive properties. The condition
of being tall is still a constituent of the interpretation of (20.33), but none of
the individuals is. To achieve the attributive use, the describing properties are not
constituents of the interpretation, and it picks out an individual if the resource
situation contains an individual who satisfies the properties.

Other uses of definite descriptions are still possible, like in appositive clauses, 
where their reference is already determined by the context, as in 





John, the neighbor I play tennis with, is a nice guy. 


and the description contributes new properties to it, or functional uses, where 
reference is made to the role itself, not to whoever plays the role in any given 
situation, as in 


The next president must be elected.


In evaluating any sentence containing a descriptive referential NP, one has to be 
particularly careful in determining which situation is described, and cannot, in general,
conclude that its truth value remains the same if one considers a larger situation
of which the situation described is part, or another situation which does not contain
the individual referred to as constituent. 

In SS, anaphora and other dependent NPs are interpreted dynamically by incrementally
extending partial assignment functions. The core idea here is that the
interpretation of an NP in a given context is an action which may affect the context
in a systematic way. Current research is focusing on the details of an inductive
definition of such dynamic interpretations of expressions of all categories as context-changing
actions and its relation to the standard static satisfaction conditions of
ordinary predicate logic. SS relies on indexing rules which operate on parsed sentences
before their interpretation, such that every NP bears a unique referential
index and every dependent NP is coindexed in subscript with its antecedent superscript.
The interpretation of these indices forms a crucial part of the dynamics of the
procedural interpretations.

Parameterized sets are an important SS construction of semantic objects needed to
interpret universal NPs. They are semantic objects in which certain constituents
are still undetermined, i.e., the parameters which need to be determined by extensions
of a given assignment function. Thus, to interpret our old example of e-type
anaphora


[Every woman who loves [a man]^j]^i kisses [him]_j. (20.10)


a parameterized set X needs to be formed which contains all pairs consisting of an
individual a and an assignment function g such that kissing holds between a and the
object g(j). All such g are supposed to be defined on the same indices. Then,
(20.10) is verified on this parameterized set, given an initial assignment function f, if
every a in the set X is a woman who, loving the man g(j), kisses him; see
Barwise (1987) for more discussion and detail.


20.3.3. Quantification and anaphora


Further discussion of anaphoric binding for theories of dynamic interpretation leads 
to the most recent developments and open research problems.

One central issue is the interpretation of plural anaphora. They can be bound by
singular universal antecedents, as in (20.34).


Every woman kissed John. They left him. (20.34)


Which semantic operations construct the group consisting of all the women who
kissed John as an appropriate referent for the plural pronoun in
(20.34) is a central question of research. The converse of this issue is illustrated in
(20.35), where the plural pronoun has a numerically appropriate plural antecedent.


All women gathered in the room. They were wearing a badge. (20.35)


The antecedent is an argument of a collective verbal predicate, denoting a property
which can only be attributed to groups, not to the individuals constituting the
group. The pronoun is, however, an argument of a distributive predicate denoting a
property true of each member in the group of women. Some semantic operation is
required to divide the group of all women as a single unit into the set of individual
women (Lønning, 1997).

Binding of plural anaphora by an antecedent in the scope of a universal NP
cannot cross sentential boundaries, as (20.36) shows.





Every father of two children sends them to Montessori school. * They love it. 
(20.36) 


Furthermore, two occurrences of plural anaphora bound by the same antecedent can 
be interpreted as referring collectively and distributively within one sentence as 





Mary and John invited their parents to their place. (20.37)


illustrates. The interpretation of (20.37) which is intended here makes the first
anaphoric reference to the parents of each of Mary and John, but the second
anaphoric reference to the place where they live together. Such issues of collective
and distributive reference and predication provide a wealth of new puzzles for
natural language semantics, which seem to lend themselves very well to analysis in
these dynamic theories of interpretation that allow for a specific part-whole structure
on their domains of reference-markers; see especially Kadmon (1987), Landman
(1996), and Roberts (1987, 1989). 

A third important problem for theories of anaphoric reference is called the ‘pro- 
portion problem,’ illustrated by (20.38). 


Most women who love a man kiss him. (20.38)


The DRT analysis seems to predict that (20.38) is true in a situation in which Jane, who
loves Jim, does not kiss him, Paula, who loves Peter, does not kiss him, but Edith,
who loves Eric, Eduard and Evert, kisses the three of them. The quantification
merely counts the cases of a woman and a man loved by her, and counts Edith three
times in verifying instances, whereas Jane and Paula are counted each only once in
two falsifying instances. Solutions to this proportion problem have been proposed
using the SS notion of parameterized sets, which suggest clearly that the dynamic
interpretation should be constructed from the interpretation of expressions and
constituents in all syntactic categories.
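
The arithmetic behind the proportion problem is easy to reproduce. The sketch below encodes the situation described for (20.38) and compares two ways of counting; the function names are hypothetical, and the second function's criterion (a woman verifies the sentence if she kisses every man she loves) is an assumption made for the example, not a solution endorsed by the text.

```python
# The situation described in the text for (20.38).
LOVE = {("Jane", "Jim"), ("Paula", "Peter"),
        ("Edith", "Eric"), ("Edith", "Eduard"), ("Edith", "Evert")}
KISS = {("Edith", "Eric"), ("Edith", "Eduard"), ("Edith", "Evert")}


def most_counting_pairs(love, kiss):
    """'Most' over woman-man pairs: each loving pair is one case."""
    verifying = sum(1 for pair in love if pair in kiss)
    return verifying > len(love) / 2


def most_counting_women(love, kiss):
    """'Most' over women. A woman counts as verifying if she kisses every
    man she loves; that criterion is an assumption of this sketch."""
    women = {w for (w, _) in love}
    verifying = sum(
        1 for w in women
        if all((w, m) in kiss for (w2, m) in love if w2 == w)
    )
    return verifying > len(women) / 2
```

Counting pairs gives three verifying cases out of five, so the sentence comes out true; counting women gives one out of three, so it comes out false, which is the intuitively expected verdict.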


20.3.4. Dynamic Montague grammar (DMG)


The question remains whether the determinism of the syntax-semantics interface, as
was required by the Fregean Principle of Compositionality, should be adhered to. It
is clear that Montague’s rule-by-rule compositionality is not adhered to in DRT, for
there is only one syntactic rule putting determiners and common nouns together
into noun phrases, but there are at least four different rules of DRS construction for
NPs, depending on whether it is a proper name, a pronoun, an indefinite (existential)
NP, or a universal NP.

Critical of DRT for abandoning the Compositionality Principle, but overall motivated
by much the same evidence, Groenendijk and Stokhof (1991) developed
DMG as a compositional theory of interpretation of discourse. An update of a given
information state relates partial assignments to variables, constituting the current
information state, to their continuations, preserving the prior assignments and adding
values to new variables. Besides the customary context-independent variables,
the formal language is enriched with discourse variables, comparable to the reference
markers of DRT, functioning like context-dependent names that create dynamic
bindings across sentential boundaries. Logical constants may be interpreted either
dynamically or statically, depending on their desired degree of stability across updates
of the information states. Dynamic conjunction, for instance, is sensitive to the order
of presentation, and hence not commutative, as the ordinary static conjunction is in
first-order predicate logic.
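
The order sensitivity of dynamic conjunction can be seen in a few lines of code. The sketch below follows the relational style of dynamic predicate logic, in which a formula maps an input assignment to a list of output assignments and conjunction is composition; it is a deliberate simplification and not the type-theoretic system of DMG itself. The names `exists`, `pred` and `conj` and the two-individual model are assumptions made for the example.

```python
# Hypothetical toy model, invented for this illustration.
DOMAIN = ["al", "bo"]
MAN = {"al"}
SIT = {"al"}


def exists(var):
    """Introduce a discourse variable: extend the input assignment with
    every possible value for `var`."""
    return lambda g: [{**g, var: d} for d in DOMAIN]


def pred(extension, var):
    """A test: pass the assignment on unchanged if the value of `var` is in
    the extension; fail if it is not, or if `var` has no value yet."""
    return lambda g: [g] if g.get(var) in extension else []


def conj(left, right):
    """Dynamic conjunction: feed every output of `left` into `right`."""
    return lambda g: [k for h in left(g) for k in right(h)]


# 'A man ... He sat down.' processed in discourse order.
in_order = conj(conj(exists("x"), pred(MAN, "x")), pred(SIT, "x"))

# The same conjuncts with the pronoun clause first.
reversed_order = conj(pred(SIT, "x"), conj(exists("x"), pred(MAN, "x")))
```

Starting from the empty assignment, `in_order` succeeds with x bound to the man who sat down, while `reversed_order` fails because the test on x is run before x has been introduced. A static conjunction would treat the two orders alike.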

A major bone of contention is the need for a pre-semantic representational level at
which anaphoric dependencies are captured. DRT claims that such a syntactic representational
level is essential, whereas SS and DMG claim they do better without.
Compositional reformalizations of DRT have been presented, and alternative dynamic
systems are being explored (Muskens, 1995; Muskens et al., 1997). The arguments 
are far from conclusive and an ultimate assessment of these issues must depend on 
the development of much more substantive and detailed semantic analyses of various 
linguistic phenomena. 


20.4. The State of the Art 


This concluding section looks at some of the areas of research in logic and natural 
language today. First, some of the open problems that command attention are 
discussed, and then there is a brief glance at how developments in natural language 
semantics apply to cognitive science.


20.4.1. Open problems 


The great deal of attention devoted to anaphora in natural language semantics has
spurred generalizations of such informational dependencies to categories other than
NPs. Partee (1973) pointed out that tenses function very much like pronouns, in
that their temporal reference can be determined deictically by the non-linguistic
context, or depend on a referential, existential or universal antecedent. Sometimes
these antecedents are adverbial, but they can also be verbs themselves.

A second analogy between NPs and VPs is commonly recognized. Mass NPs such as
‘some gold,’ ‘more peace,’ ‘all furniture’ are seen to be analogous to certain kinds
of descriptions of events, since both may contain parts of the same kind, e.g. part
of some gold is gold and part of an event of John walking is also an event of his
walking. Count NPs are on par then with event descriptions which include some
inherent endpoint, like the NP ‘a man,’ whose denotation does not contain the same man as
part, and


John walking a mile 


whose denoted event does not contain another walking of a mile by John. These two 
analogies play a very important role in developing a compositional semantic theory 
of tense and aspect, of temporal reference and quantification. Such a theory has obvious 
consequences for philosophical views on the nature of events and their identity- and
individuation-conditions. Hinrichs (1986) and Partee (1984) provide accounts of
nominal and temporal anaphora using the tools of DRT, but the topic has grown
into a fruitful field for interdisciplinary research in logic and linguistics.
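
The part-whole contrast behind the second analogy can be stated schematically. The
following is one possible rendering and not the formulation of the works cited: the
proper-part relation $<$ over a domain containing both individuals and events, and the
labels MassLike and CountLike, are assumptions introduced here for illustration.

```latex
% Assumed notation: P is the denotation of an NP or of an event description,
% and y < x means that y is a proper part of x.
\[
\mathrm{MassLike}(P) \;:\; \exists x\,\exists y\,[\,P(x) \wedge y < x \wedge P(y)\,]
\]
\[
\mathrm{CountLike}(P) \;:\; \forall x\,\forall y\,[\,(P(x) \wedge y < x) \rightarrow \neg P(y)\,]
\]
```

On this rendering ‘some gold’ and ‘John walking’ satisfy the first condition, since
something in their denotation has a proper part of the same kind, whereas ‘John walking
a mile’ satisfies the second, since no proper part of such an event is itself a walking
of a mile by John.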

Another important area of current research is the semantics of generic expressions.
There is an important distinction between generic statements which refer to a kind
as an abstract object and statements which are essentially of quantificational form
binding cases by a default operator. The two kinds of generic statements are illustrated
in (20.39) and (20.40).


Elephants are rare. (20.39) 
Elephants have valuable teeth. (20.40) 


The main semantic difference between reference to kinds and default quantification
is that only the default quantification allows for exceptions, e.g., for (20.40) an
elephant whose teeth have been cut off. Much linguistic evidence supports the
distinction, and it is especially interesting to study the interaction with anaphora.
Generic statements with universal NPs seem to allow binding of pronouns across
sentential boundaries more easily, as in


Every player chooses a pawn. He puts it on square one. (20.41)


Further observations that form explananda for natural language semantics are bindings 
which change the referential type as in (20.42). 


There is a beaver in the creek. They build dams. (20.42)


In (20.42) there is first reference to an individual beaver, but this serves as
antecedent of a pronoun which refers to the entire species. The converse dependency
is possible too, although it appears to be more restricted, as in (20.43).





Beavers build dams. I saw one/*him in the creek. (20.43)


A systematic account of such type-changing bindings is a topic of much current
research (Carlson and Pelletier, 1994). 


20.4.2. Cognitive science


To conclude this assessment of developments in natural language semantics, some
questions about the entire research program should be addressed from a more
general perspective. The renewed contact between logical theory and linguistic analysis
prompts the question what kind of theory of inference semantics is after. Should it
be a theory about the inferential abilities of idealized, competent users of a natural
language, or should it be a theory of actual inferences exhibited in human linguistic
behavior? Frege’s abhorrence of psychological interpretations of logical laws had
promoted a stark separation of logical and psychological research on inferential
processes. Most psychologists nowadays are still apt to point out that abstract
mathematical laws do not explain their actual data, because ‘people are not rational,’
‘human beings are no machines’ or ‘error is only human.’ Yet the program of
modeling inferential processes in natural language understanding by abstract logical
representations has certain explanatory claims in cognitive science as a substantial
contribution to a general theory of human cognitive capacities.

Here the classical Chomskyan distinction between a theory of competence and a 
theory of performance can clarify this apparent conflict. As a theory of inference, 
natural language semantics disregards the parameters of individual variation, cases of 
inferential failure, and normalizes its concepts by abstracting from actual practice 
and performance. Its empirical base is essentially the intuitive judgments of its 
users, not measured in a quantitative manner. Psychological theories of cognitive
capacities, however, are rooted in experimentally gained evidence from actual,
quantitatively measurable inferential behavior. As in any science, they too make
fundamental assumptions about their subject matter, excluding certain parameters in the
experimentation as irrelevant to their explanatory goals, and stabilizing the context
of their experimentation by a host of ceteris paribus clauses, which rarely receive any
independent justification. Both forms of theorizing are empirical in nature, essentially
falsifiable, and have genuine predictive power. But they contribute to our
understanding of human cognition at quite distinct levels. A theory of error in linguistic
processing is immediately relevant and perhaps even part of a psychological theory of
inferential processes, but it would not be of immediate interest to natural language
semantics. But just as aphasia studies can provide us with arguments concerning the
modularity of the brain and its cognitive functions, so too a theory of inferential
failure may be able to provide evidence concerning the modularity of the brain for
inferential processing and the interference with other cognitive functions. 

If the two kinds of cognitive theory are seen as contributing explanatory insights at
different levels, they can be considered respectively as characterizing the algorithms
of inferential processes and characterizing the actual implementations of such
algorithms in the human wetware. Nevertheless, it is important to emphasize again that
both areas of research regard inferential processes as a central theme in a theory of
human cognitive capacities.


Suggested further reading 


The Handbook of Logic and Language, edited by J. van Benthem and the author (1997), is an
excellent resource for research in the interface of logic and linguistics; this hefty volume offers
a comprehensive review of the state of the art as of 1996 in the logical aspects of syntax and
semantics of natural language. The Handbook of Contemporary Semantic Theory, edited by
S. Lappin (1996), is also an authoritative survey of selected topics in natural language
semantics. Language: An Invitation to Cognitive Science, Volume I, edited by L. Gleitman and M.
Liberman (1995), is the first volume of a set that offers a comprehensive and well-balanced
introduction to the emergent interdisciplinary field of cognitive science. Meaning and Grammar:
An Introduction to Semantics, by G. Chierchia and S. McConnell-Ginet (1990), is an elementary
textbook of linguistics, teaching basic concepts, tools and results of semantic theory in the
tradition of model theoretic semantics derived from Montague grammar; it provides plenty of
good exercises for hands-on practice in semantic analysis. L. F. T. Gamut’s (1991) Logic,
Language and Meaning (two volumes) is a basic textbook of first-order logic written for an
audience of philosophy students; it also includes modal logic and higher-order intensional logics
with application to natural language in Montague semantics. English Grammar: A Generative
Perspective, by L. Haegeman and J. Guéron (1999), is a comprehensive textbook of English syntax,
with some relations to semantics and phonology, from the point of view of generative theory.


internationally distinguished logicians, philosophers, computer scientists, and linguists
- provide comprehensive studies of the concepts, motivations, methods, formal
systems, major results, and applications of their subject areas.





The Blackwell Guide to Philosophical Logic engages both general readers and 
experienced logicians and provides a solid foundation for further study. 


LOU GOBLE is Professor of Philosophy at Willamette University. He has published
numerous articles on philosophical logic in various anthologies and journals such as
Journal of Philosophical Logic, Logique et Analyse, Notre Dame Journal of Formal Logic, 
and other more general philosophy journals. 